I had to perform a small spike for one of my clients: how to automate downloading CSV files from a website that requires authentication.
The actual downloading of the file is quite simple; there are simple APIs in .NET that do the job just fine. The problem is how to get the authentication part done.
My solution was to try to download the files and, if I get redirected to the login page, open a browser and let the user log in. Once the user is logged in, transfer the cookies from the browser to the download API and then try to download the files again. Here is what the code looks like (it is based on a WPF application):
The first thing to do is to download the file, and if that fails, show a browser with the redirected url.
private void DownloadFiles()
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.xxx.com/dl/fileToDownload.CSV");
    request.CookieContainer = GetUriCookieContainer(new Uri("http://www.xxx.com"));
    // execute the request; dispose the response so the connection is released before the next attempt
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    {
        // this is how the website tells you that you are not logged in: it doesn't use HTTP error codes
        // but rather redirects you to the login page. Each web site will do this differently.
        if (!IsLoggedIn(response.ResponseUri))
        {
            // in this case, just show the form's web browser control and set its source to the redirected page
            wb.Visibility = System.Windows.Visibility.Visible;
            wb.Source = response.ResponseUri;
        }
        else
        {
            // hide the browser control, the download seems valid
            wb.Visibility = System.Windows.Visibility.Hidden;
            // todo: do something smart with the response
        }
    }
}

private static bool IsLoggedIn(Uri responseUri)
{
    return !responseUri.AbsoluteUri.ToLower().Contains("access.denied");
}
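To make the redirect check concrete, here is a small console sketch that runs the same kind of test against two made-up URLs. The login page path and the class name RedirectCheckDemo are assumptions for illustration only; the real marker depends on the site. The sketch also uses an ordinal, case-insensitive search instead of ToLower(), which sidesteps culture-specific lowercasing.

```csharp
using System;

static class RedirectCheckDemo
{
    // Same idea as IsLoggedIn in the post: the site signals "not logged in"
    // by redirecting to a URL that contains "access.denied".
    public static bool IsLoggedIn(Uri responseUri)
    {
        return responseUri.AbsoluteUri.IndexOf(
            "access.denied", StringComparison.OrdinalIgnoreCase) < 0;
    }

    static void Main()
    {
        // Hypothetical URLs; the actual login page path is site-specific.
        var loginRedirect = new Uri("http://www.xxx.com/Access.Denied.aspx?ReturnUrl=%2fdl%2ffileToDownload.CSV");
        var fileUrl = new Uri("http://www.xxx.com/dl/fileToDownload.CSV");

        Console.WriteLine(IsLoggedIn(loginRedirect)); // False
        Console.WriteLine(IsLoggedIn(fileUrl));       // True
    }
}
```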
Notice the call to GetUriCookieContainer(new Uri("http://www.xxx.com")), which grabs the cookies from the app's cookie store so that they can be attached to the request. Here is how the magic occurs (thanks to pinvoke.net).
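public static CookieContainer GetUriCookieContainer(Uri uri)
{
    CookieContainer cookieContainer = null;
    int datasize = 131072; // allocate 128k of memory for interop
    StringBuilder cookieData = new StringBuilder(datasize);
    if (!InternetGetCookieEx(uri.ToString(), null, cookieData, ref datasize, InternetCookieHttponly, IntPtr.Zero))
    {
        if (datasize < 0)
        {
            return null;
        }
        // the first call failed: retry with a buffer of the size reported by the API
        cookieData = new StringBuilder(datasize);
        if (!InternetGetCookieEx(uri.ToString(), null, cookieData, ref datasize, InternetCookieHttponly, IntPtr.Zero))
        {
            return null;
        }
    }
    if (cookieData.Length > 0)
    {
        cookieContainer = new CookieContainer();
        // WinINet separates cookies with ';' while SetCookies expects ','
        cookieContainer.SetCookies(uri, cookieData.ToString().Replace(';', ','));
    }
    return cookieContainer;
}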
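// needs: using System.Net; using System.Runtime.InteropServices; using System.Text;
[DllImport("wininet.dll", SetLastError = true)]
public static extern bool InternetGetCookieEx(
    string url,
    string cookieName,
    StringBuilder cookieData,
    ref int size,
    Int32 dwFlags,
    IntPtr lpReserved);

// INTERNET_COOKIE_HTTPONLY: also return cookies that are marked HttpOnly
private const Int32 InternetCookieHttponly = 0x2000;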
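The Replace(';', ',') call in GetUriCookieContainer deserves a word: the cookie string comes back as "name=value; name2=value2", whereas CookieContainer.SetCookies treats commas as the separator between cookies. The following is a minimal sketch of just that conversion, runnable without WinINet. The class name, the helper name FromWinInetString and the cookie values are made up for illustration.

```csharp
using System;
using System.Net;

static class CookieHeaderDemo
{
    // Converts a "name=value; name2=value2" string into the comma-separated
    // form that CookieContainer.SetCookies expects, and loads it for the given URI.
    public static CookieContainer FromWinInetString(Uri uri, string cookieData)
    {
        var container = new CookieContainer();
        container.SetCookies(uri, cookieData.Replace(';', ','));
        return container;
    }

    static void Main()
    {
        var uri = new Uri("http://www.xxx.com");
        // Hypothetical cookie string, standing in for what InternetGetCookieEx returns.
        CookieContainer container = FromWinInetString(uri, "session=abc123; lang=en");

        foreach (Cookie cookie in container.GetCookies(uri))
        {
            Console.WriteLine(cookie.Name + " = " + cookie.Value);
        }
    }
}
```

One caveat with this approach: a cookie whose value itself contains a comma would probably be split in the wrong place, so a site that issues such cookies would need a more careful parser.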
Now the last problem: how do we know that the user has authenticated in the browser control? I accomplished this by simply monitoring the web browser's navigation for any URL that hints that the user is no longer being redirected to the login page.
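// assumed wiring: a handler attached to the browser control's Navigated event; it retries once the login page is left
private void wb_Navigated(object sender, NavigationEventArgs e)
{
    if (IsLoggedIn(e.Uri))
    {
        DownloadFiles();
    }
}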
And that's it: you use a browser control to let the user authenticate to the web site, and then fetch the authentication cookies to automate the downloads.
