Transition from Blogger to GitHub Pages (Jekyll)

Hear ye, hear ye! I am transitioning my personal website/blog from Blogger to GitHub Pages. If you are searching for my portfolio or previous posts, please head over to gremsi.com for the time being.

Detecting URLs/Links Clicked on a Webpage

This has more to do with WebView in the Windows Store APIs, but it could apply to other situations since it's just JavaScript. I wanted a way to detect which page the user navigates to, and unfortunately WebView does not have that functionality. So I had to resort to injecting JavaScript...

Now I'm not a JavaScript expert, so if there are better ways to do this, please let me know :)


Instead of changing each and every link and adding my own custom handlers, I decided to override the onclick event on the body. Whenever the user clicks anywhere on the page, my custom onclick handler receives the event and handles it accordingly.


It first checks whether the clicked element is an <a> tag; if it is, we are pretty much done since we found the link. If it isn't, it keeps checking the parent tag until it reaches the top-most tag (html). This handles cases where you have an <img> tag inside an <a> tag: when the user clicks on the image, the onclick event is received on the image and not the <a> tag, so we have to traverse upwards.


Here is the code:


Link Detection Script



 document.body.onclick = function(e)
 {
    // Walk up from the clicked element looking for an enclosing <a> tag.
    // A click on e.g. an <img> inside a link targets the image, so keep checking
    // parents until we reach the top-most tag (html) or run out of parents.
    // GenerateID() is assumed to be defined elsewhere in the injected script.
    var currentElement = e.target;
    while(currentElement && currentElement.nodeName != 'HTML')
    {
       if(currentElement.tagName == 'A')
       {
          // javascript: links are not navigations; report them and let the page handle the click.
          if(currentElement.href.indexOf('javascript:') == 0)
          {
             window.external.notify('{\'id\':\'message_printout-'+GenerateID()+'\',\'action\':\'message_printout\',\'message\':\'Link was clicked with javascript void or some javascript function\'}');
             return true;
          }
          // rel="external" or target="_blank" means the link wants a new page.
          var rel = currentElement.rel;
          var target = currentElement.target;
          var newpage = false;
          if(rel == 'external' || target == '_blank')
             newpage = true;
          window.external.notify('{\'id\':\'leaving_page-'+GenerateID()+'\',\'action\':\'leaving_page\', \'url\':\'' + currentElement.href +'\', \'newpage\':\''+newpage+'\'}');
          // Returning false cancels the default navigation so the app can handle the url itself.
          return false;
       }
       currentElement = currentElement.parentNode;
    }
    // No enclosing link was found: let the click through.
    return true;
 };
Note: The window.external.notify code is specific to WebView in Windows Store apps. What I'm doing here is basically notifying my application from JavaScript running inside the WebView. So when a link is clicked I get a message with the URL, which I then handle myself. You could just replace window.external.notify with console.log or your own function call.
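
As a rough sketch of that substitution, the call could be wrapped in a small helper that falls back to console.log when window.external.notify is not available. The name notifyHost and its return values are made up for illustration; they are not part of WebView or any other API.

```javascript
// Hypothetical helper (not part of WebView): sends a message to the host app
// when window.external.notify exists, otherwise logs it to the console.
// Returns which path was taken so callers can tell the two apart.
function notifyHost(message) {
  if (typeof window !== 'undefined' &&
      window.external &&
      typeof window.external.notify !== 'undefined') {
    window.external.notify(message);
    return 'webview';
  }
  console.log(message);
  return 'console';
}
```

With something like this in place, the handler above could call notifyHost(...) instead of window.external.notify(...) and the same script could be tried out in a regular browser.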

This should detect links in about 80% of cases. The other 20% are iframes and dynamic websites that use jQuery and Ajax. You have to handle iframes separately: look at each iframe, find its document element, and execute this JavaScript inside it.
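
A minimal sketch of the iframe step could look like the following. It assumes the iframes are same-origin and already loaded; the browser blocks access to cross-origin frame documents, so those are simply skipped. The function name attachToFrames and its parameters are placeholders of mine.

```javascript
// Sketch: installs the given click handler on the body of every reachable
// iframe inside `doc`. Returns how many frames the handler was attached to.
function attachToFrames(doc, handler) {
  var frames = doc.getElementsByTagName('iframe');
  var attached = 0;
  for (var i = 0; i < frames.length; i++) {
    try {
      // For cross-origin frames this is null or throws, depending on the browser.
      var frameDoc = frames[i].contentDocument;
      if (frameDoc && frameDoc.body) {
        frameDoc.body.onclick = handler;
        attached++;
      }
    } catch (err) {
      // Access denied: nothing can be injected into this frame from here.
    }
  }
  return attached;
}
```

Usage would be something like attachToFrames(document, document.body.onclick) after the link detection script has been installed on the main page.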


For other complex websites, the script above might not detect click events. An example is mail.yahoo.com: when you load an email, the link detection script does not detect any clicks inside the email body. I wasn't able to figure out why, other than that the click is being handled by some other script. So for these few cases I just altered the URL (inside the href attribute) to call my function. It looks like this:



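function CustomOnClick(url, newpage)
{
   // Handle the clicked link here, e.g. pass it to the app with window.external.notify.
   //console.log('link-detect: ' + url + ' ' + newpage );
}


function linkReplacementScript()
{
    var aTagList = document.getElementsByTagName('a');
    for(var i = 0; i < aTagList.length; i++)
    {
      var url = aTagList[i].href;
      var rel = aTagList[i].rel;
      var target = aTagList[i].target;
      // Clear rel/target so the rewritten link never opens a new window on its own.
      aTagList[i].rel = '';
      aTagList[i].target = '';
      var newpage = false;
      if(rel == 'external' || target == '_blank')
         newpage = true;
      // Leave javascript: links alone (this also skips links rewritten by an earlier run).
      if(url.indexOf('javascript:') != 0)
      {
        // Escape backslashes and single quotes so the url cannot break the generated call.
        var safeUrl = url.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
        aTagList[i].href = 'javascript:CustomOnClick(\''+safeUrl+'\',\''+newpage+'\');';
      }
   }
}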
But then there are cases where the DOM is altered after the page has loaded, for example dynamic websites that insert content using Ajax/jQuery. For these we have to detect when the DOM has changed and then call the link replacement script above. There is a way to detect this using MutationObserver (see this post).

Basically, whenever you get a DOM-updated event, you call the link replacement script. Note: the link replacement script is only for cases where the link detection script has failed. It's a catch-all fallback, just in case.
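
A minimal sketch of that wiring, assuming a browser that supports MutationObserver. The name watchForNewLinks is illustrative, and the optional third parameter exists only so the function can be exercised outside a browser with a stand-in observer.

```javascript
// Sketch: calls `relink` whenever nodes are added anywhere under `root`.
// `ObserverCtor` is an optional stand-in for the browser's MutationObserver.
function watchForNewLinks(root, relink, ObserverCtor) {
  var Ctor = ObserverCtor || MutationObserver;
  var observer = new Ctor(function (mutations) {
    for (var i = 0; i < mutations.length; i++) {
      if (mutations[i].addedNodes && mutations[i].addedNodes.length > 0) {
        relink(); // one pass over the page is enough per batch of mutations
        return;
      }
    }
  });
  // Only child-list changes are observed. Rewriting href attributes does not
  // add nodes, so the replacement script will not re-trigger the observer.
  observer.observe(root, { childList: true, subtree: true });
  return observer;
}
```

In a page this would be called as watchForNewLinks(document.body, linkReplacementScript), and the returned observer can be stopped later with observer.disconnect().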

Proxy and WebView (with Windows Store APIs)

I was looking into setting a proxy for WebView in Windows Store applications. Unfortunately, most of the methods I tested did not pan out.
Method #1: Using .NET APIs
There is no setProxy method in the WebView class, unfortunately. There is a post on MSDN about WebView and proxy support: http://social.msdn.microsoft.com/Forums/windowsapps/en-US/13287270-1d49-4d23-aa89-9360673c81ef/proxy-settings-for-webview#637e21ad-5b3d-46fb-8d32-93c3d0b72b65
The MSFT employee said WebView will use the IE proxy settings. However, you can set a proxy for custom HTTP requests made with HttpClient and HttpWebRequest. (More on this later.)
Method #2: Reflection
I attempted to use reflection to see if there were any hidden methods I could call inside the WebView class. Unfortunately this did not pan out either. I should probably say right now that none of the methods panned out except the last one (kind of).
At first, reflection would not show me private members. After a bit of research I found that the lookup needs BindingFlags set to include non-public members (BindingFlags.NonPublic, combined with BindingFlags.Instance). But since I was doing this in a Windows Store application, I was unable to set the binding flags (the Windows Store APIs did not support this).
So I had to create a PCL (Portable Class Library), use reflection there with the binding flags set to include private members, and see whether I could find any set-proxy members. The only thing I found was a hasProxyImplementation method, but no way to set the actual proxy for WebView.
Method #3: 3rd Party Components
Another option was to use some kind of 3rd party alternative to WebView, but I had no luck finding one. Someone did work on a very simple alpha-stage alternative (http://stackoverflow.com/questions/13497556/windows-store-webview-alternative), but nothing that replaces WebView's functionality.
Method #4: App Level Proxy
There isn't a way to set an app-level proxy programmatically. I believe it was Windows 8.1 that let you set a proxy for Metro apps through the Metro settings menu.
Method #5: Setting IE Proxy
And as far as I know, there is no way of setting the proxy for IE programmatically.
Method #6: Intercepting HTTP Requests from WebView
Intercepting Requests & Proxy
After looking into it, I was able to get one method to work. By listening on a socket using HttpService and pointing the WebView source to "http://localhost:[port#]", I am able to intercept the initial request that the WebView makes to the socket.
After the initial request comes in, I make my own request using HttpClient. Take a look at how to add a proxy to HTTP request messages: Proxy with HTTP Requests
When I get the response back, I write the headers and the content back to the WebView. The WebView then displays the page.
Handling External Links
To handle external links, you have to inject JavaScript to detect when a link has been clicked, capture that link, and give it back to your app with window.external.notify. (Remember, images can be enclosed in a link tag too.)
With this method, you get additional requests for images and JavaScript files that need to be loaded. Let's say we have an image /logo.png on the page. Since the WebView is pointed to http://localhost:[port#], it will request the image from http://localhost:[port#]/logo.png. This request comes through the socket and you have to handle it yourself.
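
To make the mapping concrete, here is a small sketch of how a request path arriving on the local socket could be turned back into a URL on the real site. It is written in JavaScript only to stay consistent with the injected scripts; the function name toUpstreamUrl and the siteBase parameter are illustrative assumptions, and it relies on the standard URL class.

```javascript
// Sketch: resolves a path requested from the local listener (e.g. "/logo.png")
// against the site that is actually being displayed.
function toUpstreamUrl(requestPath, siteBase) {
  return new URL(requestPath, siteBase).href;
}
```

For example, toUpstreamUrl('/logo.png', 'http://google.com/') gives the address you would then fetch yourself (through the proxy) before writing the response back to the WebView.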
Now for images/external resources that have absolute links (http://google.com/logo.png), the WebView handles them by itself and you do not receive a request through the socket. To make these go through the socket (and proxy), you need to change the actual link on the page. For example, when the initial request comes through for http://localhost:[port#], you make your own request with HttpClient to http://google.com (as an example). When you get the response back, it contains the content of index.html. You then need to replace all absolute links (like http://google.com/logo.png) with your own relative links or with some other identifier, such as http://localhost:[port#]/?q=[url]. This way, when the WebView loads the image, the request goes through the local socket/server that you are listening on.
I have implemented the method above (with replacing links), and my results were mostly successful. I was replacing anything that started with http://, and sometimes this would also replace visible content on the page. You could look only at href attributes, but remember that there can be Ajax requests with absolute URLs that you would also have to replace, and Ajax requests do not have href attributes. This is just one example; there might be others you would have to account for.
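
As an illustration of a narrower replacement, the sketch below only touches absolute URLs inside quoted href and src attributes and leaves visible text alone. It is not the code I used: the function names are made up, the ?q= scheme follows the identifier described above, and it still misses Ajax calls, unquoted attributes, CSS url() references, and HTML entities inside URLs.

```javascript
// Rewrites absolute http(s) URLs found in quoted href/src attributes so they
// point at the local listener. Plain text and relative links are untouched.
function rewriteAbsoluteUrls(html, port) {
  var prefix = 'http://localhost:' + port + '/?q=';
  var attrPattern = /\b(href|src)\s*=\s*(["'])(https?:\/\/[^"']+)\2/gi;
  return html.replace(attrPattern, function (match, attr, quote, url) {
    return attr + '=' + quote + prefix + encodeURIComponent(url) + quote;
  });
}

// Recovers the original URL from a rewritten request path such as "/?q=...".
// Returns null when the path carries no q parameter.
function originalUrlFromRequest(requestPath) {
  return new URL(requestPath, 'http://localhost').searchParams.get('q');
}
```

When a request for /?q=... arrives on the socket, originalUrlFromRequest gives back the absolute URL to fetch; a null result means the request was for a relative resource instead.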
My Thoughts
Now after going through all of this, I would not recommend this method. If you want to display a simple page and you know the user won't navigate to external websites and whatnot, then this might work.
I was having problems with POST requests and with random headers on various websites (which needed to be removed from requests and responses). There are also too many things to account for: absolute links, images, Ajax requests, SQL queries, etc. Microsoft needs to add this functionality to the WebView APIs. Until then, there isn't a good way of making WebView go through a proxy.