Help:Using archive.today: Difference between revisions


Revision as of 16:15, 24 September 2013 by 88.15.83.61 (no edit summary), compared with the revision as of 10:22, 25 September 2013 by Hellknowz (edit summary: "at least remove the boldface and move to the bottom of lead").

Revision as of 10:22, 25 September 2013

This help page is being considered for deletion in accordance with Misplaced Pages's deletion policy.

Please discuss the matter at this page's entry on the Miscellany for deletion page.

You are welcome to edit this page, but please do not blank, merge, or move it, or remove this notice, while the discussion is in progress. For more information, see the Guide to deletion.


This help page is a how-to guide.
It explains concepts or processes used by the Misplaced Pages community. It is not one of Misplaced Pages's policies or guidelines, and may reflect varying levels of consensus.

Shortcut

WP:ARCHIVEIS

This page gives information about using Archive.is, an on-demand web archiving service, at http://archive.is/. By using Archive.is, Misplaced Pages editors can reduce link rot by preserving a copy of an online source that can be accessed if the original page is moved, changed, or removed. Not all web pages can be archived, however.

Archive.is can archive a range of content, including HTML web pages, style sheets, JavaScript, and digital images. Other web archiving services include the Wayback Machine and WebCite. The three operate differently, and certain pages can be archived by one but not the others. The Wayback Machine takes snapshots of web pages at certain times and also archives pages in response to user requests; WebCite requires someone to actively archive a link; User:RotlinkBot monitors the RecentChanges of many wiki projects (including all national Wikipedias) in order to automatically archive new links as soon as possible after editors add them to articles.

Note: Archive.is, unlike other web archive sites, does not obey the robots exclusion standard, which other archiving services such as the Wayback Machine and WebCite honor to avoid infringing copyright. Web sites with copyright-protected content use "robot" tags to inform archives that their content is not to be re-hosted on any other site; this convention was established by consensus on the robots-request@nexor.co.uk mailing list. Re-hosting copyrighted material without permission is a violation of the Digital Millennium Copyright Act (DMCA). For this reason, to avoid implicating Misplaced Pages in violations of copyright laws and incurring DMCA take-down requests, Archive.is should not be used in Misplaced Pages articles.
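
To illustrate the robots exclusion mechanism mentioned in the note, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the crawler name "ExampleArchiver", and the helper name may_fetch are all invented for illustration; this does not describe how any particular archiving service is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the invented crawler "ExampleArchiver" is barred
# from the whole site, and every other crawler is barred from /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleArchiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(may_fetch(ROBOTS_TXT, "ExampleArchiver", "http://www.example.com/page.html"))
    print(may_fetch(ROBOTS_TXT, "OtherBot", "http://www.example.com/page.html"))
```

A crawler that honors the standard would run a check like this before retrieving a page and skip any URL for which the result is False.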

How to archive

There are many ways to submit a web page to Archive.is for archiving. If you are new to using Archive.is, give the Archive.is form method a go first. The other methods are better suited to those who use Archive.is regularly.

Website form

This method is easy to use but is slower than the other methods as it requires going to the Archive.is website each time you want to archive a web page.

Go to http://archive.is.
Enter the URL of the web page you wish to archive into the "My url is alive and I want to archive its content" field (the red one).
Click the "Submit" button. When the archiving process completes (it usually takes 5-15 seconds), you will be sent to the archived page.
It is recommended that you view the archived page to check if the archive process has been successful.

Bookmarklet

Put simply, a bookmarklet is a web browser bookmark which, instead of going to a web page, performs a certain function. When you click the Archive.is bookmarklet, it takes the URL of the page you are currently looking at and submits it to Archive.is for archiving. This method is easy to set up, easy to use and fast. To get the most out of this method, it is recommended that you have your Bookmarks/Favorites bar visible or at least have your bookmarks accessible within a click or two. This method only allows you to archive the page you are currently looking at; to archive a different web page you will have to use another method.

To set up the bookmarklet, go to http://archive.is.
Drag the gray "archive.is" button into your Bookmarks/Favorites bar. You may need to hold the Shift key if you use Opera.
To use the bookmarklet, simply click on it when you are on a web page you wish to archive. This initiates the archiving process. When it completes (it usually takes 5-15 seconds), you will be sent to the archived page.
It is recommended that you view the archived page to check if the archive process has been successful.

Firefox smart keyword

Firefox smart keywords are commonly used to perform searches through the Firefox address bar or to open a bookmark by typing a keyword into the Firefox address bar. Here we are going to use a smart keyword to submit a URL to Archive.is for archiving. This method is moderately simple to set up, easy to use and fast.

To set up the smart keyword, hit Ctrl+Shift+B to open your Bookmarks Library (or click the orange Firefox button at the top left of the window, then go to "Bookmarks", then "Show All Bookmarks").
Browse to a location you would like to save the smart keyword bookmark in.
In the menu at the top of the window, click "Organize", then "New Bookmark".
Enter a name for the bookmark (e.g. Archive.is).
Enter http://archive.is/?run=1&url=%s into the Location field.
Enter a keyword for the bookmark. Choose something short that is not already used for another bookmark (e.g. wc).
Click the "Add" button. Close the Bookmarks Library.
To use the smart keyword, type the keyword you chose ("wc" in the above example) followed by a space in front of the URL of the web page you would like to archive in the Firefox address bar (e.g. if you are using "wc" as your keyword, the text in the address bar would be wc http://www.example.com/pageyouwantoarchive.html).
Hit Enter. This initiates the archiving process. When it completes (it usually takes 5-15 seconds), you will be sent to the archived page.
It is recommended that you view the archived page to check if the archive process has been successful.
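
As a rough sketch of what the smart keyword does, the following Python snippet substitutes a page URL for the %s placeholder in the template given in the steps above. The function name build_submit_url is hypothetical, and the assumption that the substituted URL is fully percent-encoded is ours; browsers may encode the text slightly differently. The snippet only builds the address and does not contact Archive.is.

```python
from urllib.parse import quote

# Template taken from the setup steps above; "%s" marks where the page URL goes.
SUBMIT_TEMPLATE = "http://archive.is/?run=1&url=%s"


def build_submit_url(page_url: str) -> str:
    """Return the submission address for page_url (hypothetical helper).

    The page URL is percent-encoded so that characters such as "&" and "?"
    inside it are not read as part of the Archive.is query string.
    """
    return SUBMIT_TEMPLATE.replace("%s", quote(page_url, safe=""))


if __name__ == "__main__":
    print(build_submit_url("http://www.example.com/pageyouwantoarchive.html"))
```

Encoding the target URL matters mainly for pages whose own address contains a query string, since an unencoded "&" would otherwise cut the submitted URL short.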

Chrome search engine

Although this is created through Chrome's search engine feature, it functions just like a smart keyword in Firefox. This method is moderately simple to set up, easy to use and fast.

To set up the "search engine", right click the address bar and select "Edit search engines...". At the bottom of the list that comes up, you can add a "search engine".
Enter a name for the "search engine" in the first field (e.g. Archive.is).
Enter a keyword for the "search engine" in the second field. Choose something short that is not already in use (e.g. wc).
Enter http://archive.is/?run=1&url=%s& into the third field
Hit Enter to save the "search engine".
To use the "search engine", type the keyword you chose ("wc" in the above example) followed by a space in front of the URL of the web page you would like to archive in the Chrome address bar (e.g. if you are using "wc" as your keyword, the text in the address bar would be wc http://www.example.com/pageyouwantoarchive.html).
Hit Enter. You will be sent to a page containing a link to the archive URL of the web page you wished to archive.
It is recommended that you view the archived page to check if the archive process has been successful.

Use within Misplaced Pages

Links archived with Archive.is may appear in two formats. The first format uses a 4- or 5-letter "Snapshot ID", similar to URL shortening services, to provide a more convenient link: http://archive.is/XXXX. The second format displays the original URL and the date of archiving within the URL itself: http://archive.is/YYYYMMDDhhmmss/http://www.example.com or http://archive.is/YYYYMMDD/http://www.example.com. Either is appropriate for use within Misplaced Pages.
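
The two link formats can be told apart mechanically. The sketch below is one possible way to do it in Python; the function name classify_archive_url and the returned dictionary keys are hypothetical, and the assumption that a Snapshot ID consists only of ASCII letters and digits is ours rather than something the service documents.

```python
import re
from typing import Optional

# Assumption: a Snapshot ID is 4 or 5 ASCII letters or digits.
SHORT_RE = re.compile(r"^http://archive\.is/([A-Za-z0-9]{4,5})$")
# Timestamp is YYYYMMDDhhmmss (14 digits) or YYYYMMDD (8 digits),
# followed by the original URL.
DATED_RE = re.compile(r"^http://archive\.is/(\d{14}|\d{8})/(.+)$")


def classify_archive_url(url: str) -> Optional[dict]:
    """Classify an Archive.is link as 'dated' or 'short'; None if neither."""
    match = DATED_RE.match(url)
    if match:
        return {"format": "dated",
                "timestamp": match.group(1),
                "original": match.group(2)}
    match = SHORT_RE.match(url)
    if match:
        return {"format": "short", "snapshot_id": match.group(1)}
    return None


if __name__ == "__main__":
    print(classify_archive_url("http://archive.is/XXXX"))
    print(classify_archive_url("http://archive.is/20130925/http://www.example.com"))
```

For the dated format, the timestamp part could be reused to fill in the archive date of a citation, whereas the short format carries no date information.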

The archive URL goes in the archiveurl= parameter of any of the citation templates, together with the supporting archivedate= and deadurl= parameters. If the original URL is no longer accessible, set deadurl to yes. If the original URL is still accessible, set deadurl to no.

<ref>{{cite web |last= |first= |title= |work= |publisher= |date= |url= |archiveurl= |archivedate= |deadurl= }}</ref>
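
For editors who assemble citations with a script, the following Python sketch fills in a subset of the template parameters shown above. The function name cite_web_with_archive and the sample values are hypothetical, and the sketch assumes none of the values contains a "|" character, which would break the template.

```python
def cite_web_with_archive(title: str, url: str, archiveurl: str,
                          archivedate: str, original_is_dead: bool) -> str:
    """Build a {{cite web}} reference with archive parameters (hypothetical helper)."""
    # deadurl is "yes" when the original URL is no longer accessible, else "no".
    deadurl = "yes" if original_is_dead else "no"
    fields = [
        ("title", title),
        ("url", url),
        ("archiveurl", archiveurl),
        ("archivedate", archivedate),
        ("deadurl", deadurl),
    ]
    body = " ".join(f"|{name}={value}" for name, value in fields)
    return "<ref>{{cite web " + body + " }}</ref>"


if __name__ == "__main__":
    print(cite_web_with_archive(
        title="Example page",
        url="http://www.example.com",
        archiveurl="http://archive.is/XXXX",
        archivedate="25 September 2013",
        original_is_dead=True,
    ))
```

The remaining parameters of the template (last, first, work, publisher, date) are left out here only to keep the sketch short.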

Searching for previously archived web pages

Web pages previously archived through Archive.is are accessible through a searchable database. Users may search by URL or by domain, and wildcards are supported for both.

References

http://wiki.dandascalescu.com/reviews/online_services/web_page_archiving
http://www.robotstxt.org/orig.html
Webcite http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1550686/ "Copyright issues are addressed by honouring respective Internet standards (robot exclusion files, no-cache and no-archive tags)."

Notes

Archive.is FAQ: A page may not be archived for a number of reasons. Archive.is does not support archiving Portable Document Format (PDF) files, audio, or video. The page may be too big (there is a 50 MB limit for a single page). The content may be inaccessible from the Archive.is network (this is particularly likely if you are attempting to access subscription-based content which your institution subscribes to on its users' behalf). Also, the content may be unreadable by the Archive.is archiver: overly complex JavaScript-based pages can crash its browser or take too long to execute, and pages involving browser checks sometimes cause the archive engine to fail.
