  1. #1
    twinsen is offline Member
    Join Date
    Aug 2009
    Posts
    31

    View Page Source as string [solved]

    Does anyone know how to get the page source of a page as a string without using XMLHttpRequest?

    The closest I have got is "pageSource = window.document.documentElement.innerHTML;",
    but that is not identical to the original source.

    XMLHttpRequest works, but it is a bit slow. I was hoping to be able to read from the in-memory cache. Opening a new tab with "view-source:url" is quick, and I would like to get that speed.
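
    For reference, the XMLHttpRequest approach mentioned above might look roughly like the sketch below. It re-requests the page URL and hands the response text to a callback. The names fetchPageSource and XhrCtor are made up for illustration and do not come from the thread; the XhrCtor parameter exists only so the function can be exercised with a stub outside a browser. Whether the request is answered from cache depends on the browser and the response headers, so it may go back to the network.

```javascript
// Hypothetical helper (not from the thread): re-request a URL and pass the
// raw response text to a callback. XhrCtor defaults to the browser's
// XMLHttpRequest and can be swapped for a stub when testing.
function fetchPageSource(url, onDone, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  xhr.open("GET", url, true); // true = asynchronous request
  xhr.onload = function () {
    // responseText is the response body as delivered by the server,
    // before any script on the page has modified the DOM.
    onDone(null, xhr.responseText);
  };
  xhr.onerror = function () {
    onDone(new Error("Request failed for " + url), null);
  };
  xhr.send();
}
```

    In a page or content script this could be called as fetchPageSource(window.location.href, function (err, source) { ... }), assuming the page can be requested again with a plain GET.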

    I've also tried a hidden IFRAME with src "view-source:url", but that just fails with "Failed to load resource". I can't find any way of creating an invisible Chrome window or tab to try extracting the source from it.

    Here is one page describing some of the things I tried:
    http://stackoverflow.com/questions/2...ge-source-code
    Last edited by twinsen; 06-16-2010 at 04:26 PM.

  2. #2
    gildas is offline Junior Member
    Join Date
    Feb 2010
    Posts
    26

    Hi twinsen,

    Here's some sample code I use in SingleFile to get the current document's HTML content. It's not the exact source, but it is very close.

    Code:
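    // Rebuild the <!DOCTYPE ...> declaration from the document's doctype node.
    // Returns an empty string when the document has no doctype.
    function getDoctype(doc) {
        var docType = doc.doctype, docTypeStr;
        if (docType) {
            docTypeStr = "<!DOCTYPE " + docType.nodeName;
            if (docType.publicId) {
                // A public identifier may be followed by a system identifier.
                docTypeStr += " PUBLIC \"" + docType.publicId + "\"";
                if (docType.systemId)
                    docTypeStr += " \"" + docType.systemId + "\"";
            } else if (docType.systemId)
                docTypeStr += " SYSTEM \"" + docType.systemId + "\"";
            // internalSubset may be undefined in some browsers; it is skipped when falsy.
            if (docType.internalSubset)
                docTypeStr += " [" + docType.internalSubset + "]";
            return docTypeStr + ">";
        }
        return "";
    }

    // Doctype plus the serialized <html> element (the current DOM, not the raw source).
    var htmlContent = getDoctype(document) + document.documentElement.outerHTML;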
    I hope this helps.

  3. #3
    twinsen is offline Member
    Join Date
    Aug 2009
    Posts
    31

    Thanks for that, it's got some of the stuff that I was missing when using innerHTML.
    Across a few sample sites, the worst-case time for XMLHttpRequest was 1.1 seconds, whereas the worst case for your code was 40 milliseconds.
    I think the minor differences in content are worth it for the speed increase.
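
    A comparison like the one above can be reproduced with a small timing wrapper. This is only a sketch: timeIt is a hypothetical name, Date.now() has millisecond resolution at best, and a single run per page is a rough measurement. It suits synchronous work such as reading outerHTML; for an asynchronous XMLHttpRequest the end time would have to be taken inside the response callback instead.

```javascript
// Hypothetical helper: run a synchronous function once and report how long
// it took in milliseconds, along with its return value.
function timeIt(fn) {
  var start = Date.now();
  var result = fn();
  return { result: result, ms: Date.now() - start };
}
```

    In a page, timeIt(function () { return document.documentElement.outerHTML; }).ms would give the serialization time for the current document.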

  4. #4
    gildas is offline Junior Member
    Join Date
    Feb 2010
    Posts
    26

    Actually, I faced the same issue when porting JSONView to Chrome. Some users found that XHR was not acceptable.

    Nevertheless, it seems that Chrome will soon offer an extension API that makes this kind of thing easier.

    Define Chromium extensions API for networking

    Define an API for Chromium extensions to access the network stack. We already defined an API that exposes proxy settings to extensions. (willchan)
    cf : http://www.chromium.org/developers/d.../network-stack

  5. #5
    Waha is offline Senior Member
    Join Date
    Apr 2009
    Location
    Oregon
    Posts
    788

    If you want the source after the page's JS has modified it, you can do it pretty much the same way you did originally, just with outerHTML:
    Code:
    pageSource = window.document.documentElement.outerHTML
    But yeah, there is no way to get the raw source as view-source shows it without XHR or a future API like the one above.
