Digging into Gmail URLs

August 22nd, 2018

A little curiosity can go a long way in digital forensics!

One of our recent cases involved an ongoing dispute between two executives who we’ll call Alice and Eve. Their dispute escalated when Alice returned after a day out of the office and noticed that her Gmail account was open on a shared computer they both used. Alice became suspicious that someone had accessed her Gmail account (she had forgotten to log out of it when she was last in the office) while she was gone. One of Alice’s coworkers told her that Eve had been using the shared computer on the day in question. Alice took a quick look at the Chrome web browser’s history, which seemed to confirm her suspicion — she saw activity which appeared to be related to her account while she was away. Alice reached out to her lawyer with her concerns, and her lawyer reached out to us.

Alice’s lawyer wanted to know if we could identify:

  1. Affirmative action on the part of Eve to look at Alice’s emails
  2. Signs of data exfiltration from Alice’s email account

It can be difficult to place a particular person behind a keyboard at a particular time, but thanks to the eyewitness we knew who was using the shared computer on the day in question. The challenge for us, or so we initially thought, was to demonstrate affirmative action on the part of Eve, which is what we will focus on in this article (setting aside the data exfiltration question for the time being). Could Eve state with some plausibility that she found Alice’s Gmail account open on the shared computer, but didn’t take any action to interact with it?

Account Identification

While examining Internet history on the shared computer, we noticed very quickly that multiple accounts appeared to be logged into Gmail at the same time. More specifically, we noticed the presence of “/u/#” (where ‘#’ is a number) appended to the URL “mail.google.com/mail”. We know from our testing that as additional accounts are logged in at the same time in the same browser, the number in “/u/#” increments accordingly. This information is particularly useful in tracking account activity over time.
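As a rough illustration, the account number can be pulled out of each history entry with a few lines of Python. This is only a sketch: the function name is our own invention for this example, and the sample URLs in the usage note are constructed from the “/u/#” pattern described above rather than taken from the case.

```python
import re
from typing import Optional
from urllib.parse import urlsplit


def gmail_account_index(url: str) -> Optional[int]:
    """Return the number from "/u/#" in a Gmail URL, or None if it is absent.

    Hypothetical helper for illustration; it only recognizes URLs whose host
    is mail.google.com and whose path starts with /mail/u/<number>.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() != "mail.google.com":
        return None
    match = re.match(r"^/mail/u/(\d+)(?:/|$)", parts.path)
    return int(match.group(1)) if match else None
```

For example, `gmail_account_index("https://mail.google.com/mail/u/1/#inbox")` returns `1`. Grouping history entries by this value is one way to separate activity belonging to different logged-in accounts.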

At this point, we knew that both Alice’s and Eve’s Gmail accounts were open on this computer at approximately the same time. But we were still looking for evidence of affirmative action on the part of Eve to interact with Alice’s account. As we will see, Gmail provides some additional clues in its URLs, including folder names, actions such as composing a new message, search results, and even which page of the search results was being viewed.

Folders & Messages:

Gmail URLs may include the string “#folder” where folder is the name of the folder being viewed. In the example below we see “#inbox”, referring to the Inbox folder, and “#inbox/p43”, which refers to the 43rd page of the emails in the Inbox folder being viewed.

If a particular message is being viewed, the folder name will be followed by what we called a “message ID”; the message subject may also be visible in the page title, as in these examples:

In this example a message with the subject “FW: ACME”, in the “Sent” folder, was being viewed:

We knew at this point that it was possible that someone (presumably Eve) viewed the following messages in Alice’s account:

  • Message “Reconnect” – message ID “1629b99e355e1662”
  • Message “Taxes” – message ID “162c5c6bd9ca793e”
  • Message “FW: ACME” – message ID “162b6ed6bcf08f77”
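The folder, page number, and message ID described in this section can be separated with a small parser. The sketch below makes some assumptions: the function and class names are ours, a page is taken to be “p” followed by digits (as in “#inbox/p43”), and a message ID is taken to be exactly 16 lowercase hexadecimal characters, which matches the IDs listed above but is not a documented Gmail rule.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class GmailView(NamedTuple):
    folder: str                # e.g. "inbox" or "sent"
    page: Optional[int]        # 43 for ".../p43", else None
    message_id: Optional[str]  # 16 hex characters, else None


_PAGE = re.compile(r"p(\d+)")
_MESSAGE_ID = re.compile(r"[0-9a-f]{16}")


def parse_gmail_view(url: str) -> GmailView:
    """Split the part of a Gmail URL after "#" into folder, page and message ID."""
    fragment = urlsplit(url).fragment
    # Ignore anything after "?" inside the fragment (such as compose parameters).
    fragment = fragment.split("?", 1)[0]
    segments = [s for s in fragment.split("/") if s]
    folder = segments[0] if segments else ""
    page = None
    message_id = None
    for segment in segments[1:]:
        if _PAGE.fullmatch(segment):
            page = int(segment[1:])
        elif _MESSAGE_ID.fullmatch(segment):
            message_id = segment
    return GmailView(folder, page, message_id)
```

With a sample URL ending in “#sent/162b6ed6bcf08f77”, this returns the folder “sent” and the message ID “162b6ed6bcf08f77”; with one ending in “#inbox/p43” it returns the folder “inbox” and page 43.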

Search:

In addition to viewing particular emails in Alice’s Gmail account, someone (again, presumably Eve) performed a simple search as well as an advanced search.

A simple search occurs when someone types a search string into the “search mail” box at the top of the Gmail page:

In the following example, a search was run for “[redacted]@umich.edu”:

An advanced search can be performed by clicking the “down arrow” on the far right of the simple search box. The advanced search allows for an additional level of granularity by specifying things like “only search for read emails”:

Below are URLs we observed which would result from performing an advanced search for “Read Mail” and “Date within 1 day [of none]”. The top entry below reflects someone viewing a message with the ID “15f0cfb6fd2a1101”, the middle entry shows someone viewing page 103 “p103” of the search results, and the bottom entry shows someone viewing page 102 “p102” of the search results:

At this point we felt fairly confident that whoever was at the computer was actively interacting with Alice’s Gmail account. Why? We noticed:

  • Browsing through multiple pages of the Inbox folder
  • Viewing particular emails in the Inbox and Sent folders
  • Simple and advanced searching with subsequent email viewing

We shared our findings with our client who was ultimately able to use this information to support their argument in the larger dispute. There was one more thing that we noticed which did not end up playing a role in this case (because we had an intact copy of Internet history) and it had to do with message composition…

Message Composition:

Let’s first take a look at what some of the possible URLs associated with message composition look like.

Composing a new message from the Inbox:

Composing a new message (now with a message ID):

Composing two new messages at once (note the presence of “%2C” separating the message IDs):

Composing a new message while also viewing message “162f2d83bcda8a47”:
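The compose patterns described above (“compose=new”, “compose=” followed by one message ID, or several IDs separated by “%2C”) can be pulled apart with a short sketch like the one below. The function name is hypothetical, and the code only assumes that a “compose=” parameter appears somewhere after the “#” in the URL; it does not depend on the rest of the URL layout.

```python
import re
from typing import List
from urllib.parse import unquote, urlsplit


def compose_targets(url: str) -> List[str]:
    """Return the values of the compose parameter in a Gmail URL.

    The result is ["new"] for a draft with no ID yet, one entry per message
    ID otherwise, and an empty list when nothing is being composed.
    """
    fragment = urlsplit(url).fragment
    match = re.search(r"(?:^|[?&])compose=([^&]*)", fragment)
    if not match:
        return []
    # "%2C" is the URL-encoded comma, so decode first and then split on ",".
    return [value for value in unquote(match.group(1)).split(",") if value]
```

A history entry with two IDs joined by “%2C” yields a two-element list, which makes it easy to count how many drafts were open at once.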

Now that we’ve established what message composition looks like, let’s turn our attention back to this thing that caught our eye…

Decoding Message IDs

We observed the following series of URLs which appeared related to message composition, but their frequency was remarkable. What was happening here? Either Eve was a prolific writer or what we’ve called a message ID is more than meets the eye. (Yes, I just used that phrase.)

As with anything in digital forensics that we don’t understand, we were curious and started testing. We immediately noticed that the message IDs appeared to be hexadecimal values and that they appeared to increment over time. We opened up a new message window and waited. The URL did not change from “compose=new” to “compose=[messageID]”. Next we typed in a recipient email address and waited… the URL changed and we now had a message ID in the URL! So we waited again, for about 5 minutes this time, and the URL didn’t change. We then started typing in a subject and a body to the message, pausing occasionally. As we did this we noticed two things: (1) the message ID would change and, around the same time, (2) we would see “saved” pop up in the bottom of the window. Some additional testing appeared to confirm the link between the two. Great, so maybe what we have been calling the message ID is a timestamp?

So, let’s look at the ID: a 16-character hexadecimal number. If we work under the premise that the ID is a timestamp, what is it based on and what does it represent? We know some timestamps are measured from 1601, and others from 1970. Some timestamps are measured in seconds, milliseconds, or microseconds. In this case none of these combinations appeared to make sense, so we took our testing a step further and introduced nanoseconds.

We know April 25, 2018 1:00 PM (local time) is one of the dates identified by our Internet history report. If we treat the message ID (162fdbfea62de936) as a count of nanoseconds, divide it by 1,000,000,000 to convert it to seconds, and run the result through Python’s datetime.fromtimestamp function, we see the following:
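That first attempt can be sketched roughly as follows. The function name is ours, and we pass an explicit UTC time zone (rather than relying on the machine’s local time) so the result is the same wherever the code runs.

```python
from datetime import datetime, timezone


def naive_decode(message_id_hex: str) -> datetime:
    """Treat the message ID as nanoseconds since 1970 (the first, wrong guess)."""
    seconds = int(message_id_hex, 16) / 1_000_000_000
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(naive_decode("162fdbfea62de936").isoformat())
```

For this ID the result lands in late August 2020, more than two years after the April 2018 date we were expecting.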

 

 

We’re close (in the grand scheme of time), but as you can see we are off by years, months, days, hours, and minutes, which won’t do us any good during a case. We found similar discrepancies in all of the IDs we tried to convert. We sent Arnau Gàmez, who is working with us this summer, down this rabbit hole. Arnau originally hypothesized that this difference was the result of some fixed offset (i.e. a fixed number of seconds). Through methodical testing his understanding evolved and he arrived at a number, 1.048576, which we called “Arnau’s Number”. When we also divided by that number in our function, we got the following result:
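A minimal sketch of the corrected conversion is shown below. The constant and function names are ours; the only change from the nanosecond attempt is the extra division by 1.048576. As before, we use UTC so the output does not depend on the examiner’s time zone.

```python
from datetime import datetime, timezone

ARNAUS_NUMBER = 1.048576


def decode_message_id(message_id_hex: str) -> datetime:
    """Convert a Gmail message ID to a UTC datetime using Arnau's Number."""
    seconds = int(message_id_hex, 16) / 1_000_000_000 / ARNAUS_NUMBER
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(decode_message_id("162fdbfea62de936").isoformat())
```

For the ID 162fdbfea62de936 this gives April 25, 2018 at 17:00:35 UTC, which is 1:00 PM in a UTC-4 zone such as US Eastern Daylight Time.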

 

 

That looked like our target date (Apr 25, 2018 – 1:00 PM). We tested message IDs from 1999 to present and found Arnau’s Number to consistently resolve this unusual offset.

After all this methodical testing, Arnau eventually found (and Brian also Googled) that 1.048576 (or, 1,048,576) was simply a representation of “mega” as it is sometimes applied in computing: 1,024 × 1,024, or 2^20. It is the kind of thing that sometimes results in misunderstandings when one vendor quantifies a megabyte as 1,000KB and another as 1,024KB.
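One way to read this result: dividing by 1,000,000,000 and then by 1.048576 is the same as dividing by 1,000 × 1,048,576, so the ID divided by 2^20 is a count of milliseconds since 1970. In other words, dropping the lowest 20 bits of the ID leaves a millisecond timestamp. The sketch below does the conversion with integer arithmetic only; the function name is ours, and the bit-level interpretation is our inference from the arithmetic rather than anything documented by Google.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def message_id_to_datetime(message_id_hex: str) -> datetime:
    """Decode a Gmail message ID by discarding its low 20 bits.

    Shifting right by 20 bits is an integer division by 2**20 (1,048,576);
    the remaining value is interpreted as milliseconds since 1970 (UTC).
    """
    millis = int(message_id_hex, 16) >> 20
    return _EPOCH + timedelta(milliseconds=millis)
```

Because no floating-point division is involved, the result is exact to the millisecond: for 162fdbfea62de936 it is 2018-04-25 17:00:35.810 UTC.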

Arnau took the hard route to decoding these message IDs, but most importantly he found the value that would result in accurate timestamp decoding!

PS – Our curiosity and eventual findings paid off quickly in a subsequent case…

While working on this article, we were engaged in another case requiring us to analyze Internet history in order to find evidence of who was using a shared computer at a particular time. Our examination led us into unallocated space… where we found webpage titles (which identified a Gmail account) along with Gmail URLs containing message IDs. So you might ask us, why didn’t we look at the timestamps for those Gmail URLs and call it a day? Because there were no timestamps associated with them. We had to apply our message ID findings and convert the message ID values to timestamps, which helped us establish that there was particular Gmail account activity on particular (and quite important) days. Keep in mind that these are timestamps produced by Google, not the computer in question, which many of our colleagues in digital forensics will appreciate.
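For readers who want to try something similar, here is a simplified sketch of the idea, not the tooling we used in the case. It scans a block of raw bytes for ASCII Gmail URLs, picks out any 16-character hexadecimal values in them, and decodes each one by dropping its low 20 bits and reading the rest as milliseconds since 1970. The function name and regular expressions are our own assumptions, and URLs stored in other encodings (UTF-16, for instance) would need additional handling.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# A Gmail URL followed by any run of printable, non-space ASCII bytes.
_URL = re.compile(rb"https?://mail\.google\.com/mail/[\x21-\x7e]*")
# Exactly 16 lowercase hex characters, not embedded in a longer hex run.
_ID = re.compile(rb"(?<![0-9a-f])[0-9a-f]{16}(?![0-9a-f])")


def carve_gmail_timestamps(blob: bytes) -> List[Tuple[str, datetime]]:
    """Return (message ID, decoded UTC datetime) pairs found in raw bytes."""
    results = []
    for url_match in _URL.finditer(blob):
        for id_match in _ID.finditer(url_match.group(0)):
            message_id = id_match.group(0).decode("ascii")
            millis = int(message_id, 16) >> 20
            results.append((message_id, _EPOCH + timedelta(milliseconds=millis)))
    return results
```

Any hits should still be reviewed in context, since a 16-character hexadecimal string inside a Gmail URL is not guaranteed to be a message ID.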

 


 

