
Windows 8: Important Considerations for Computer Forensics and Electronic Discovery

Introduction

Documents identified by computer forensic investigations in civil litigation typically require review and analysis by attorneys to determine whether the uncovered evidence could support causes of action such as breach of contract, breach of fiduciary duty, misappropriation of trade secrets, tortious interference, or unfair competition.  Bit-for-bit forensic imaging of workstations is also commonly used as an efficient method to quickly gather evidence for further disposition in general commercial litigation matters.  For example, instead of relying upon individual custodians to self-select and copy their own files, forensic images of workstations can be accurately filtered to exclude system files, which only a computer can interpret, and to identify the files humans actually use, such as Microsoft Word, Excel, PowerPoint, and Adobe PDF files and email.  In any of the above situations, be it a trade secrets matter or a general commercial litigation case, litigants are always highly sensitive to the potential costs associated with attorney review.

Now that Microsoft Windows 8 workstations are available for sale and will likely be purchased for use by corporate buyers, civil cases involving the identification and analysis of emails from such machines are a certainty.  Recently, excellent computer forensic research on Windows 8 performed by Josh Brunty, Assistant Professor of Digital Forensics at Marshall University, revealed that “In addition to Web cache and cookies, user contacts synced from various social media accounts such as Twitter, Facebook, and even e-mail clients such as MS Hotmail are cached with the [Windows 8] operating system”  (source:  http://www.dfinews.com/article/microsoft-windows-8-forensic-first-look?page=0,3).  Building on Professor Brunty’s scholarship, I set out to determine the extent to which, and in what file formats, email communications exist on a Windows 8 machine.  A further goal was to identify any potential issues for processing locally stored communications for attorney review in the discovery phase of civil litigation.

As you will see, the format in which Windows 8 stores email locally does in fact present potentially significant challenges to cost-effective discovery in both trade secrets matters and general commercial litigation cases.  Fear not: my conclusion offers some potential solutions as well as other important considerations.  I have written this article in detailed steps so that others might more easily duplicate my results.

Testing

My testing was performed on the Release Preview version of Windows 8, so I will be upgrading the subject workstation to the current retail version, re-running my tests and reporting the results in a later publication.

1.  Subject Workstation “Laptop”

  • Make/Model:  Dell Latitude D430
  • Specifications:  Intel Core 2 CPU U7600 @ 1.20GHz / 2.00GB Installed RAM
  • OS:  Windows 8 Release Preview / Product ID:  00137-11009-99904-AA587
  • HARD DRIVE:  SAMSUNG HS122JC ATA Device / Capacity 114,472 MB

2.  Windows 8 Installation

The Dell Laptop originally came with Windows XP Professional installed, but I replaced XP with Windows 8 Release Preview (“W8”) using an installation DVD burned from the W8 .ISO file provided by Microsoft’s website.

3.  Windows 8 Preparation

I created a single user account called “User” with a password of “password”.  After the W8 initiation phase ended, I was presented with the new “tile” interface, which is much more akin to the iPhone, iPad, and Android interface metaphor.  Unfortunately, my Dell laptop did not have a touch screen that would have allowed me to take more advantage of the tiles.  Even on this older machine, the built-in track pad and other mouse controls all worked perfectly out of the box, so I was able to proceed with installing various communication applications.

A.  Connecting the Windows 8 laptop to web based accounts

On W8’s default new tile screen, there are three key tiles I began with: “People”, “Messaging” and “Mail”.  Within the “People” tile, I connected my contacts to my Microsoft, Facebook, LinkedIn and Google accounts.  Connecting to these external accounts brought in a flurry of contact profile pictures, email addresses, phone numbers, physical addresses, and, from LinkedIn, company names, job titles and websites.  Interestingly, my own record, “Me”, did not import a profile picture from any of my online accounts, leaving a generic silhouette tile.  Perhaps Microsoft excludes LinkedIn, Gmail and Facebook from supplying the local Windows 8 profile picture.  I do not have a profile picture associated with my Microsoft Live account, which might be the cause of the missing picture.

Below is the end-user view under the Windows 8 “Mail” tile showing imported emails from my Google Gmail account:

  • Inbox: 34
  • Drafts:  0
  • Sent items:  15
  • Outbox:  0
  • Junk:  0
  • Deleted items:  22
  • [Gmail] / All Mail:  34
  • [Gmail] / Spam:  0
  • [Gmail] / Starred:  2
  • [Gmail] / FORENSIC:  1
  • [Gmail] / Receipts:  0
  • [Gmail] / Scarab:  2
  • [Gmail] / Travel:  0

4.  End User Installed Applications

I installed the following applications on the laptop:

A.  Programs recorded by the Control Panel:

1.  Adobe Flash Player 11 Plugin ver. 11.4.402.287

2.  Google Chrome ver. 23.0.1271.64

3.  Mozilla Firefox ver. 16.0.2 (x86 en-US)

B.  Programs listed under Windows 8’s “Store” tile:

1.  Tweetro (I did not link to any Twitter account)

2.  Xbox Live Games (using Microsoft account user name “larry_lieb@yahoo.com”)

Using the Chrome browser, I logged into my Google account and installed “Gmail Offline” to see what effect this add-on would have.  After installing “Gmail Offline”, the Chrome icon now appears in the system tray by default when viewing the Desktop.  I then logged in to a newly created Yahoo account, which I called “larry.lieb@yahoo.com”.  I sent and received several emails to and from my Yahoo and Gmail accounts.  While logged into my Yahoo.com email account, I imported contacts from my LinkedIn account.  Now that I had created multiple sources of email and instant message correspondence, I set about imaging the laptop.

5.  Forensic Imaging

I used Forward Discovery’s Raptor 2.5 (http://forwarddiscovery.com/Raptor) installed to a USB flash drive from the Raptor 2.5 .ISO file using Pendrivelinux.com’s free USB Linux tool.  I changed the boot order to USB drive first which then caused the laptop to boot the Raptor 2.5 operating system instead of Windows 8.

Within Raptor 2.5, I used the Raptor Toolbox to first mount a previously wiped and formatted external Toshiba hard drive, which was connected to the laptop via a USB cable.  The total imaging and image verification process took close to eleven hours due to the slow USB connection.  The internal Samsung hard drive uses a ZIF (zero insertion force) connector, so although I may have been able to achieve a faster imaging time using my Tableau ZIF to IDE tool (http://www.tableau.com/index.php?pageid=products&model=TDA5-ZIF), I was loath to tempt equipment failure, as Tableau states, “ZIF connectors are not very robust and they are typically rated for only 20 insertion/removal cycles.”  In addition, the Tableau kit only comes factory direct with Toshiba and Hitachi cables, which would not work with the Samsung drive.

6.  Indexing

Using Passmark’s OSForensics ver. 1.2 Build 1003 (64 Bit) on my Digital Intelligence µFred forensic station (http://www.digitalintelligence.com/products/ufred/), I created an index of the Windows 8 files contained within the Raptor 2.5 created Encase evidence files.  OSForensics was able to create an index of the entire contents in around one hour.

Under OSForensics’ “File Name Search” tab, I ran searches for common email file types.  Out of 142,712 total items searched, OSForensics identified:

A.  2,204 items using the search string “*.eml”

B.  0 items using the search string “*.msg”

C.  0 items using the search string “*.pst”

D.  0 items using the search string “*.mbox”

Using OSForensics’ “Create Signature” tab, I was able to run and export a hash value and file list report for the folder “1:\Users\User\AppData\Local\Packages\”.
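For readers who want to reproduce a similar extension tally and hash report against an exported or mounted copy of the evidence (outside of OSForensics), a minimal Python sketch follows; the export path and report name are hypothetical placeholders, not values taken from this examination:

# Illustrative only: tally file extensions and hash every file in an exported folder.
import hashlib
import os
from collections import Counter

EXPORT_DIR = r"D:\Export\Packages"  # hypothetical export of the Packages folder

ext_counts = Counter()
with open("hash_report.csv", "w", encoding="utf-8") as report:
    report.write("path,md5,sha1,size_bytes\n")
    for root, _dirs, files in os.walk(EXPORT_DIR):
        for name in files:
            path = os.path.join(root, name)
            ext_counts[os.path.splitext(name)[1].lower()] += 1
            md5, sha1 = hashlib.md5(), hashlib.sha1()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1024 * 1024), b""):
                    md5.update(chunk)
                    sha1.update(chunk)
            report.write(f"{path},{md5.hexdigest()},{sha1.hexdigest()},{os.path.getsize(path)}\n")

# Counts for the email-related extensions searched for above.
for ext in (".eml", ".msg", ".pst", ".mbox"):
    print(ext, ext_counts.get(ext, 0))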

7.  .EML files

Using AccessData’s FTK Imager 3.1.1.8, I exported the contents of the folder path, “Users\User\AppData\Local\Packages\microsoft.windowscommunicationsapps_8wekyb3d8bbwe\LocalState\Indexed\LiveComm\larry_lieb@yahoo.com\”.

I noticed several interesting folders that might warrant different treatment for electronic discovery projects:

A.  Location of folder storing .EML files containing email communication:

OSForensics found 264 .EML files under the “Mail” folder path:

“microsoft.windowscommunicationsapps_8wekyb3d8bbwe\LocalState\Indexed\LiveComm\larry_lieb@yahoo.com\120510-2203\Mail\”

B.  Location of folder storing .EML files containing contacts:

OSForensics found 1,939 .EML files under the “People” folder path:

“microsoft.windowscommunicationsapps_8wekyb3d8bbwe\LocalState\Indexed\LiveComm\larry_lieb@yahoo.com\120510-2203\People\”

 C.  Location of folder storing my “User” .EML contact file:

OSForensics found 1 .EML file under the “microsoft.windowsphotos\..\People\Me” folder path that contains my “User” profile:

“Users\User\AppData\Local\Packages\microsoft.windowsphotos_8wekyb3d8bbwe\LocalState\Indexed\LiveComm\larry_lieb@yahoo.com\120510-2203\People\Me”
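Before moving on to the conclusion, it is worth noting that the Mail and People .EML files can be separated and triaged with a short script once the LiveComm folder has been exported.  The sketch below is illustrative only; the export path is a hypothetical placeholder, and the script simply splits the files by parent folder and prints basic headers from the mail messages using Python’s standard email parser.

# Illustrative sketch: separate "Mail" from "People" .EML files in an exported
# LiveComm folder and pull basic headers from the mail messages.
import os
from email import policy
from email.parser import BytesParser

LIVECOMM_DIR = r"D:\Export\LiveComm"  # hypothetical export location

mail_items, contact_items = [], []
for root, _dirs, files in os.walk(LIVECOMM_DIR):
    for name in files:
        if not name.lower().endswith(".eml"):
            continue
        parts = {p.lower() for p in root.split(os.sep)}
        (contact_items if "people" in parts else mail_items).append(os.path.join(root, name))

print(f"Mail .EML files: {len(mail_items)}, contact .EML files: {len(contact_items)}")

for path in mail_items[:10]:  # show the first few messages
    with open(path, "rb") as f:
        msg = BytesParser(policy=policy.default).parse(f)
    print(msg.get("Date"), "|", msg.get("From"), "->", msg.get("To"), "|", msg.get("Subject"))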

Conclusion

In electronic discovery projects that utilize forensic imaging tools to capture workstation hard drives, it is common for data filtering to be requested, such as de-NISTing, file type, keyword, date range, and de-duplication.  Oftentimes, a file type “inclusion” list will be used to identify “user files” for further processing, such as Microsoft Word, Excel, PowerPoint, Adobe PDF, and common email file types such as .PST, .MSG, and .EML.  Files found in the forensic image(s) will be exported for further processing and review by attorneys.

One of the challenges attorneys face in electronic discovery is reasonably keeping costs low by avoiding human review of obviously non-relevant files.  However, as Windows 8 appears to store contacts from LinkedIn, Gmail, and other sources as .EML files, it is apparent that using a file type inclusion list with .EML as an “include” choice will bring in many potentially non-relevant files.

If an attorney is billing at a rate of $200/hour and can review fifty documents per hour, then the 1,939 “contact” .EML files alone would require 38.78 hours of attorney review time at a cost to the client of $7,756.00.  Therefore, it may make sense for all parties to stipulate that .EML files from the “People” folder be excluded from processing and review unless the hard drive custodian’s contact list is potentially relevant to the underlying matter.
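The arithmetic is simple enough to sanity-check in a few lines; the rate and throughput figures below are the assumptions stated above, not market data.

# Reproducing the review-cost arithmetic above.
rate_per_hour = 200.0       # attorney billing rate (USD), as assumed above
docs_per_hour = 50          # assumed review throughput
contact_emls = 1939         # .EML files found under the "People" folder

hours = contact_emls / docs_per_hour
cost = hours * rate_per_hour
print(f"{hours:.2f} hours of review, roughly ${cost:,.2f}")
# -> 38.78 hours of review, roughly $7,756.00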

In some cases, litigants do not or cannot pay for outside vendor electronic discovery processing fees and will direct their counsel to simply produce their electronically stored information.  I advise against this practice, as the potential for producing privileged or protected information exists with this approach.  A requesting party may also object to the costs de facto shifted to them.  Nonetheless, best practices and economic reality do not always mesh.  Parties that wish to take this “no attorney review prior to production” approach with evidence gathered from Windows 8 machines may risk overproducing the “contact” .EML files to their opponent and should consider the risks associated with not allowing a professional to apply filters to their collection up front.

Companies that are planning to purchase and implement Windows 8 workstations may want to consider altering their IT policies to prevent employees from linking to personal Gmail, LinkedIn and other web-based identities, so that personal communications are not stored locally.  I am uncertain whether such an option is available within the administrative portion of the Windows 8 operating system, or whether employee handbooks and training alone might be able to stop employees from bringing their personal accounts to work.

From the standpoint of a trade secrets-type computer forensic investigation, having a suspected former employee’s Gmail communications locally and readily available is excellent; certainly this ease of access is preferable to sending a subpoena to Google to retrieve similar information.  However, in this author’s experience, general commercial litigation cases vastly outnumber cases involving traditional computer forensic issues.  Companies that take steps to proactively prevent corporate Windows 8 machines from caching their employees’ personal communications locally may experience significantly lower discovery costs in the long run.

Acknowledgments

1.  Josh Brunty, Assistant Professor of Digital Forensics at Marshall University
2.  David Knutson and Tim Doris of Duff & Phelps for their sage opinions on Linux live CD versus ZIF connector to a hardware write-protection device acquisition approaches.
3.  Patrick Murphy and Raechel Marshall of Quarles & Brady for their insight into document production risks.

About

Larry Lieb, ACE, CCA

CIO

Scarab Consulting  (www.ConsultScarab.com)

LINK TO PDF:  Windows 8 Computer Forensics and Ediscovery Considerations



Forensic Artifact: Malware Analysis in Windows 8


Windows is the most used operating system worldwide. I have met a lot of IT people in my country, as well as other computer specialists, and my observation was that 90 percent of them use Windows. I wondered whether that was just my country, so I contacted some friends from the UK, USA, India, and Pakistan, and they said the same about the wide use of the Windows OS in their countries. The picture was a bit different for my friend in the USA, and I also noticed that a lot of my friends there use Mac OS X. This doesn’t change the fact that Windows is still the most used worldwide, and because of this, hackers and intruders have had a lot more time to study Windows and create a lot of malware for it. The popular Windows OS has been tagged the most vulnerable OS. Now there is a new Windows OS. The question is: Is it vulnerable as well?

This article focuses on the new version of Windows. Windows 8 was released on October 26, 2012. It was designed to work perfectly on a touch screen. The interface is so catchy!!

As a computer lover, I follow a page on Facebook named “computer freaks.”


Recently, a picture was posted there showing that, across the timeline of the Microsoft Windows operating system, there has always been a good OS, then a bad OS, and then a good one again; kind of like an arithmetic progression with a common difference of one among the good Windows operating systems. Because of this, I decided to do an analysis of this Windows 8 edition of Microsoft Windows to see what might really make it bad or “SHIT,” as the picture puts it.

I began to do research on Windows 8 and I discovered that three patches have already been released for Microsoft’s new operating system. This reminds me of when Vista was released: there were so many patches that they just had to make a better version of the Windows OS (Windows 7). I’ve used Windows 7 for a long time and I’ve also met some Windows 7 power users who can testify that it is a good one from Microsoft. However, I still think Windows XP ranks higher when we focus on system stability.

Speaking more fully of malware, one of Microsoft’s major objectives is to reduce the risk of their OS being infected by malware. As a result, several measures have been taken to reduce the chances of malware infection in Windows 8. Jason Garms of Microsoft has provided some tips on how to keep your PC free from malware in the link below:

http://blogs.msdn.com/b/b8/archive/2011/09/15/protecting-you-from-malware.aspx

Windows 8 has proven to be less vulnerable to malware, because the Windows Defender that comes with it is very active, with good heuristics for detecting malware. Even with all the new security, the common saying still remains true: there is no total security, so you cannot be totally safe from malware on Windows 8; the risk of being affected by malware is just reduced. Windows 8 got better in a lot of ways, to the point that even its error page has been transformed.

This doesn’t have to do with malware, but one thing I still don’t like about the Windows OS is the inability to retain commands on the console (command prompt) after the cmd is closed and reopened. For those who work more on the console, you can imagine using a lot of very long commands and then, simply because you mistakenly closed this console session, when you open another, all those commands are gone and you need to retype them. This is unlike the terminal (console) in Linux.


I read and heard from different sources that Windows 8 was secure, but I am a big-time skeptic, so I had to prove it to myself. To confirm that Windows 8 is not so vulnerable to malware, I started by creating a ProRat Trojan server on my Windows 7 machine and then sent it to my Windows 8 machine. I have tried this Trojan several times and I’m no novice with it; I used it often in the days when I loved threatening schoolmates on the network, and I still have a good handle on it. As soon as I copied the server file onto the Windows 8 OS from an external drive, Windows Defender deleted it. This was really amazing: I don’t have any third-party AV installed, and my computer could still react that way to malware. I have even seen some Windows 7 machines with third-party AVs that would not detect the server file due to poor heuristics. However, one third-party AV you can rely on to some extent is Norton, with its Bloodhound heuristics. The image below is what I got when I put my Trojan server on Windows 8:


The image below contains a hexadecimal dump of the Trojan server that was used.

Windows 8 is indisputably the most secure Microsoft Windows yet, but we still cannot match its malware detection with that of Mac OS X. I realized that the Windows 8 Defender that protects us from malware is essentially the popular Microsoft Security Essentials. It’s good the way they saved us the trouble of installing Microsoft Security Essentials separately.

There are interesting testimonies everywhere about Windows 8 and its safe usage and security, but this doesn’t make it impeccable. Although I haven’t personally found any faults in Windows 8, from my research I discovered that the newly released Windows version already had its first security patch on November 13, 2012, just a few weeks after it was released.

Also, though I’m not completely sure about this, I came across an article that said the Bitdefender company had tested some malware on Windows 8 and one piece of malware got past its defenses. This particular malware is capable of creating backdoors that allow hackers to remotely control the host computer, steal gaming credentials, and a lot more.

However, the company used malware collected over the last six months, which is not ideal, because the test sample won’t include every threat and also because every antivirus product misses some software nastiness, giving a greater chance to the attacker.

Bitdefender also tested the malware by fetching a copy of the malicious code from an internal FTP server and executing it to see how far the malware progresses, as opposed to visiting a booby-trapped web page that attempts to compromise the PC, which is a more common method of infection. In theory, there should be little difference, but this methodology bypasses SmartScreen, which filters out phishing attacks and malware downloads when using Internet Explorer.

Well, this is not an issue that should make you reconsider using Windows 8, because a lot of antivirus vendors are just trying to find fault with Windows Defender (the built-in Microsoft Security Essentials) in order to make a case for their own AVs.


Another test I tried is the backdoor. I installed a WAMP server on Windows 8 in my VMware setup and tried to upload a backdoor shell onto it from my host operating system. I kept trying this, but to no avail. Then I tried to manually drop the shell into the guest Windows 8 server directory, and I received a message notifying me that malware had been detected and the file could not be accessed.


I know the c99, c100, GNY, and r57 shells are very well known and blocked by a lot of anti-malware programs. Because of this, I tried to use a WSO shell instead, but it too was stopped.

If it were left to me, I would say that Windows 8 could be a means to put an end to hackers’ invasions of web servers. If most web servers running on Linux were moved to Windows 8, hackers would probably have less chance to upload backdoor shells and damage our web content.

Since some of the antivirus companies have predicted future security shortcomings in the otherwise secure Windows 8, we also have to be prepared to keep our PCs protected.
I will start by providing you with good software for analyzing Windows executable files. With this software, you can check whether anything is attached to an executable application you want to run on your computer. It is called PEiD, and it identifies portable executables.

Download with the MediaFire link: http://www.mediafire.com/?f2yu4wzbrq3bp2a

If an attacker does successfully find a way to drop malware on your PC, you can also remove it manually, but you must be very careful, because some malware has anti-tracing features that can make your OS crash the moment you detect it.

So to find it, open the registry and check the following keys for suspicious entries (a scripted way to enumerate them is sketched after the list):
HKEY_LOCAL_MACHINE\Software\Microsoft\Active Setup\Installed Components
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Run
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\RunServices
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\RunServicesOnce
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\RunOnce
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\VMM32Files
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\VxD
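The sketch below enumerates a few of these auto-run locations with Python’s standard winreg module; it only lists the value names and data so you can eyeball anything unfamiliar, and it is meant to be run on the Windows machine being examined.

# Enumerate auto-run values under a few of the registry keys listed above.
import winreg

RUN_KEYS = [
    (winreg.HKEY_LOCAL_MACHINE, r"Software\Microsoft\Windows\CurrentVersion\Run"),
    (winreg.HKEY_LOCAL_MACHINE, r"Software\Microsoft\Windows\CurrentVersion\RunServices"),
    (winreg.HKEY_CURRENT_USER,  r"Software\Microsoft\Windows\CurrentVersion\RunOnce"),
]

for hive, subkey in RUN_KEYS:
    try:
        key = winreg.OpenKey(hive, subkey)
    except OSError:
        continue  # key does not exist on this system
    print(f"\n[{subkey}]")
    index = 0
    while True:
        try:
            name, value, _type = winreg.EnumValue(key, index)
        except OSError:
            break  # no more values under this key
        print(f"  {name} = {value}")
        index += 1
    winreg.CloseKey(key)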

The reason for locating the malware yourself is polymorphic malware: malware that changes its form to make it impossible for antivirus software and firewalls to detect.
Some malware can also make itself run at system startup by copying itself into the following files:

Config.sys (in the System32 folder)

Autoexec.bat (in the root of the system drive)

System.ini (in the Windows folder)

Before using the registry keys to locate the files, you may want to disconnect from the network, since it is likely an attacker is using the malware to reach you.


After locating any suspicious file in the registry keys, you can double-click the file to find its path, as the image below shows:

As a regular Windows user, you should know that we can’t delete files that are running in the background, so we need to find the file in Task Manager’s process list and stop it.


That’s not malicious software, but it’s the exact file I examined in the registry. If it were malicious, I would just click the “End Process” button to put an end to its work. Then you can go back to the directory the registry gave you and delete the file. After this, you will need to restart your PC if you know the malicious software has not caused much damage to your computer. If it has eaten up some of your system files, you may need to reinstall your Windows OS by using an installation CD to go through the installation process again. This way your files remain intact in a folder named “Windows.old” in your C: directory.

I have given a link to download the PEid. Now I will show you a little way to make use of this software in examining your executable files.

When you unpack the rar file I gave in the link, you will see an interface like the one below:


Do not mind the Windows Explorer look of my copy of the software; it’s just a skin pack. On your PC, the three blue dots should be the minimize, maximize, and close buttons. To check the details of a particular exe file, select the file in the first option of the PEiD GUI.


Now, to check the active processes that may include the malware, you can click “Task Viewer,” which gives you a result much like Task Manager does. When you select any task, it will show every file attached to the process and working with it. PEiD is a really good aid for malware detection.


Windows 8 Defender uses the colors green, yellow, and red to show its security level. To make your Windows 8 more secure from malware, I advise that you update Windows Defender as often as possible, just as you would any third-party antivirus, if you really want to stay secure.


Sometimes malware will be placed inside software that you already have on your PC. For instance, suppose you downloaded a game that was functioning properly before it started malfunctioning. It is advisable to take an MD5 checksum of downloaded files so that, when a file becomes suspicious, you can compute the checksum again, compare it with the earlier value, and tell whether the file has been tampered with. You can download software for checking MD5 hashes on Windows here:

http://www.4shared.com/zip/qsq6WC8O/NetTools4574.html
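If you would rather not install anything, the same check can be done with Python’s standard hashlib module; the file name below is just a placeholder for whatever you downloaded.

# Compute and compare MD5 checksums with the standard library.
import hashlib

def md5sum(path, chunk_size=1024 * 1024):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

baseline = md5sum("game_setup.exe")   # hash taken right after the download
print("baseline:", baseline)

# ...later, when the file starts behaving suspiciously:
if md5sum("game_setup.exe") != baseline:
    print("Checksum changed - the file has been modified since download.")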

Joseph Orekoya is a security researcher for InfoSec Institute. InfoSec Institute is a security certification company that has trained over 15,000 people including popular CEH and CCNA certification courses.

References

http://www.pcworld.com/article/2013807/windows-8-already-getting-security-patches.html

http://blogs.msdn.com/b/b8/archive/2011/09/15/protecting-you-from-malware.aspx

http://www.anandtech.com/show/4822/windows-8-malware-protection-detailed

http://windows.microsoft.com/en-US/windows-8/windows-defender#1TC=t1

http://www.theregister.co.uk/2012/06/21/win8_security/

http://propellerheadforensics.files.wordpress.com/2012/05/thomson_windows-8-forensic-guide2.pdf



Bad Sector Recovery


Hard drives are built so that they never return unreliable data. This means that if a hard drive cannot guarantee 100 percent accuracy of the data requested, it will simply return an error and will never give away any data at all.

This article explains how bad sector recovery actually works and why it needs to be done with great caution.

Understanding Bad Sectors

General causes of bad sector formation are physical or magnetic corruption. Physical corruption is easy to understand—it occurs when there is physical damage done to the media surface. Magnetic corruption occurs when a hard drive miswrites data to a wrong location. While the latter may seem to be less damaging, it is actually as dangerous as physical damage, as miswritten data may damage not only adjacent sectors but also servo sectors.

[Figure: sector layout diagram]

Regardless of the cause of damage, there are several possible outcomes:

  • Address Mark field corruption
  • Data corruption
  • ECC field corruption
  • Servo sector corruption
  • Or any combination of these

What is common in all these types of corruption is that your operating system or normal data recovery tools cannot read the data from those sectors anymore.

Let’s find out exactly what happens when a tool tries to read a sector that has one of the above-mentioned problems.

Address Mark corruption

When Address Mark is corrupted, the hard drive simply cannot find the requested sector. The data might still be intact, but there is no way for the hard drive to locate it without the proper ID. Some modern hard drives do not actually use sector ID or Address Mark in the sector itself; instead, this information is encoded in the preceding servo sector.

Data corruption

To verify data integrity, a hard drive will always validate it with the error checking and correction algorithm using the ECC code written after the data field (see above diagram). When data is corrupted, the hard drive will try to recover it with the same ECC algorithm. If correction succeeds, the drive will return the sector data and will not report any error. However, if correction fails, the drive will only return an error and no data, even if the data is partially intact.

ECC field corruption

Although this is rare, the ECC code can also get corrupted. In this case, the drive reads perfectly good data from the sector and checks its integrity against the ECC code. The check fails due to the bad ECC code, and the drive returns an error and no data at all, because there is no way to verify data integrity.

Servo sector corruption

There are up to a few hundred servo sectors on a single track. Servo sectors contain positioning information that allows the hard drive to fine-tune the exact position of the head so that it stays precisely on track. They also contain the ID of the track itself.

Servo sectors are used for head positioning in the same way a GPS receiver uses satellites—to exactly determine the current location. When a servo sector is damaged, the hard drive can no longer ensure that the data sectors following the servo sector are the ones it is looking for and will abort any read attempt of the corresponding sectors.

How Bad Sector Recovery Works

Once again, hard drives are built to never return data that did not pass integrity checks.

However, it is possible to send a special command to the hard drive that specifically instructs it to disable error checking and correction algorithms while reading data. The command is called Read Long, and it was introduced into the ATA/ATAPI standard in its first release back in 1994. It allowed reading the raw data plus the ECC field from a sector and returning it to the host PC as is, without any error checking or correction attempt. The command was dropped from the ATA/ATAPI-4 standard in 1998; however, most hard drive manufacturers kept supporting it.

Later on, when hard drives became larger in capacity and LBA48 was introduced to accommodate drives larger than 128 GiB, the command was officially revived in a SMART extension called SMART Command Transport or SCT.

Obviously, since the drive does not have to verify the integrity of data when the data is requested via the Read Long command, it would return the data even if it is inconsistent (or, in other words, the sector is “Bad”). Hence, this command quickly became standard in bad sector recovery.

There is also another approach which is based on the fact that some hard drives leave some data in the buffer when a bad sector is encountered. However, our tests have shown that chances of getting any valid data this way are exactly zero.

Debunking Bad Sector Recovery

So to “recover” data from a bad sector, one would simply need to issue the Read Long command instead of the “normal” Read Sectors command. That is really it! It is so simple any software developer who is familiar with hard drives can do it. And sure enough, more and more data recovery tools now come with a Bad Sector Recovery option. In fact, it has come to the point where if a tool does not have a bad sector recovery feature, it automatically falls into a second-grade category.

Error checking and correction algorithms were implemented for a reason, which is data integrity. When a hard drive reads a sector with the Read Long command, it disables these algorithms, and hence there is no way to prove that you get valid data. Instead, you get something that may or may not resemble your customer’s data.

Tests in our lab have shown that, in reality, this approach will get you far more random bytes than anything else. Yes, there are cases where it allows recovering original data from a sector, but these cases are extremely rare in real data recovery scenarios, and even then, only a part of the recovered sector will contain valid data.

Even when we got some data off the damaged sector, what exactly should we do with its other (garbled) part? And how exactly do we tell which part of the sector has real data in it and which is just random bytes? Nobody is going to manually go through all the sectors in a HEX editor and judge which bit is valid and which is not. Even if someone did, there is no way to guarantee that what they see is valid data.

And this is where the real problem starts.

Dangers of Read Long approach

Imagine a forensic investigator recovering data off a suspect’s drive while the drive has some bad sectors on it. To get more data off the drive, the investigator enabled Bad Sector Recovery option in his data acquisition tool. In the end, his tool happily reported that all the sectors were successfully copied, so he began extracting data from the obtained copy.

While looking for clues, he found a file that had social security numbers in it. He then used these numbers in one way or another for his investigation.

What he did not know is that one of the sectors that contained these numbers was recovered via the Read Long command, and some bits were flipped (which is very common for this approach). So instead of 777-677-766, he got 776-676-677, causing him and other people a whole lot of unnecessary trouble.

Another example: when recovering a damaged file system, even slightly altered data in an MFT record can mislead the file recovery algorithm and in the end do much more harm than if there was no data copied at all in that sector.

Once again, an error checking and correction algorithm is in place for a great reason. There is absolutely no magic in bad sector recovery; it is impossible to recover something that just isn’t there.

There are tools that claim better bad sector recovery because they utilize a statistical approach, an algorithm where the tool reads the bad sector a number of times and then reconstructs the “original” sector by locating the bits that occur most often in the sector. While these tools claim this approach could improve the outcome, there is no evidence to back up the validity of such claims. Furthermore, rereading the same spot many times while the hard drive is failing is a good way to cause permanent damage to the media or heads.
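For the curious, the bit-level majority vote those tools describe can be sketched in a few lines of Python; this is purely illustrative of the idea, under the same caveat that nothing about it guarantees the reconstructed sector matches the original data.

# Majority-vote reconstruction across several raw reads of the same 512-byte sector.
def majority_vote(reads, sector_size=512):
    out = bytearray(sector_size)
    for byte_idx in range(sector_size):
        value = 0
        for bit in range(8):
            ones = sum((r[byte_idx] >> bit) & 1 for r in reads)
            if ones * 2 > len(reads):        # bit is 1 in more than half the reads
                value |= 1 << bit
        out[byte_idx] = value
    return bytes(out)

# Example with three made-up reads of the same sector:
reads = [bytes([0b10110010] * 512), bytes([0b10110110] * 512), bytes([0b10100010] * 512)]
print(majority_vote(reads)[:4])   # -> b'\xb2\xb2\xb2\xb2'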

To summarize, if you are after valid data, avoid using any bad sector recovery algorithms. These algorithms will never offer data integrity no matter how complex their implementation is. And when you absolutely must recover data from bad sectors, make sure you use a tool that properly accounts for these recovered sectors, marking the files containing such sectors. This way, the operator has the ability to disregard such “unreliable” files and manually verify file integrity if it is an important one.

Dmitry Postrigan is the founder and CEO of Atola Technology, a Canadian company that makes high-end data recovery and forensic equipment.


What are ‘gdocs’? Google Drive Data

As “the Cloud” (a varied mix of internet-based services ranging from web-based email accounts to online storage and services that synchronise data across multiple computers) becomes more relevant, and the dominance of the PC or tablet as the exclusive “home” for data declines, the days when simply taking a snapshot of a computer captured all available data have gone.

For a number of years Google have offered online solutions for creating, editing and publishing a range of files, including Word and Excel documents. More recently this service has been linked into Google Drive, which offers more functionality but also allows users to synchronise data across various devices. An individual with a Google Drive account is allocated 5GB of free data storage and can obtain further storage at a cost.

Google Drive supports many of the file types and formats we work with every day, including docx, xlsx and pptx. However, any user who creates a new file, for example a word-processing file, via the Google Drive website will by default generate a file with a .gdoc extension (.gslides for a presentation and .gsheet for a spreadsheet). These file types are Google-formatted files and are synchronised across all devices running Google Drive.

Running Google Drive on a standard Windows PC by default creates a folder within the user profile. On a Windows 7 PC the Google Drive data is stored in the following path: C:\Users\USERNAME\Google Drive. Google Drive deals quite robustly with maintaining and synchronising the data as changes are made; a successful synchronisation of the data is indicated by a small green tick and an out-of-date synchronisation by blue arrows, as shown below.


Because the gdoc file is created on the PC during the synchronisation process, it generates and maintains its own metadata on the PC. By right-clicking on the gdoc file and viewing properties you see the usual dates and times (created, modified and accessed).

As you would expect the created date reflects the time the file was first created and successfully synchronised on the PC.

The important point to note is that if a file is created via the Google website at 1200 on 1st January 2013, but the PC with Google Drive installed is not connected to the internet until 1630 on 10th January 2013, then the creation date of the gdoc file on the PC will show 1630 on 10th January 2013, because this was when the PC synchronised with the Google Drive website. The modification and accessed dates update as you would expect, with the same limitations as the created date.

The valuable metadata is stored on the Google Drive servers; however, this presents us with a challenge:

  1. How do we gain access to the account?
  2. How do we get the metadata out?

One important piece of metadata maintained by Google is the revision history.

The revision history is a cross between “track changes” and a backup solution: Google “snapshots” the data when changes are made, permitting users to jump back, at the click of a button, to any version of the file prior to those changes.

This means that it is possible to see what a document looked like several days ago, even after a number of changes have since been made to the content. This is fantastic information; however, it is not readily available to download or capture in an offline form.

Instead, this data can only be captured by communicating with Google Drive through its own API. This is tricky and a challenge; nevertheless, with the appropriate programming skills the revision history data can be captured.
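As a rough illustration of what that looks like, the sketch below lists a file’s revisions via the Google Drive REST API (the v3 “revisions” endpoint). The file ID and OAuth access token are placeholders you would obtain for the account in question, and the endpoint and field names are assumptions drawn from the public API documentation rather than from this article.

# List a file's revision history via the Google Drive REST API (sketch).
import requests

FILE_ID = "YOUR_FILE_ID"            # placeholder
ACCESS_TOKEN = "YOUR_OAUTH_TOKEN"   # placeholder - token authorised for the account

resp = requests.get(
    f"https://www.googleapis.com/drive/v3/files/{FILE_ID}/revisions",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    params={"fields": "revisions(id,modifiedTime,lastModifyingUser/displayName)"},
)
resp.raise_for_status()

for rev in resp.json().get("revisions", []):
    user = rev.get("lastModifyingUser", {}).get("displayName", "unknown")
    print(rev["id"], rev["modifiedTime"], user)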

If we take a deeper look at the gdoc file created on a PC, we notice it is tiny, only 1KB in size. The reason is that the actual content of the file is not stored on your PC. The gdoc file is nothing more than a pointer to the data on the Google Drive server.

If we look inside the gdoc file, it contains a URL which is a unique reference to the data on Google Drive’s systems, and only those with appropriate account credentials can view the data. This is true of gslides and gsheet files also.
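As a practical aside, that pointer can be pulled out with a short script. Publicly documented examples of .gdoc files show a small JSON document with a “url” and a “resource_id” field; treat that layout as an assumption rather than something established above, which is why the sketch falls back to printing the raw text if the format differs. The file path is a placeholder.

# Read the pointer inside a .gdoc file (assumed JSON layout; falls back to raw text).
import json

def read_gdoc_pointer(path):
    with open(path, "r", encoding="utf-8") as f:
        raw = f.read()
    try:
        data = json.loads(raw)
        return data.get("url"), data.get("resource_id")
    except ValueError:
        return raw, None   # not JSON - return the raw contents for manual inspection

url, resource_id = read_gdoc_pointer(r"C:\Users\USERNAME\Google Drive\Report.gdoc")
print("URL:", url)
print("Resource ID:", resource_id)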

There are important considerations when dealing with Google formatted data including documents, spreadsheets and presentations.

  • First, forensically imaging PCs with Google Drive installed and Google-formatted files stored on the PC is an incomplete exercise because, although the PC holds pointers to data held on the Google server, it does not hold the actual data.
  • Second, there is a huge amount of valuable information stored on Google Drive about files, in particular the revision history. Where Google Drive is in use, efforts should be made to harvest this data with a view to building, if necessary, a more detailed picture of the evolution of the file.

For clarity, I should add that files in a non-Google format that are stored in a user’s Google Drive are synchronised and stored in full on users’ PCs: they do not adopt the pointer system that is utilised by Google-formatted files.

Keep an eye on our blog page for future posts relating to this topic.



Forensic SQLite Extraction for Everyone

Artifacts all over

Nowadays, SQLite databases have become a very popular and common forensic resource, the new quasi-standard for storing information. They are found on smartphones, hard disk drives, thumb drives, etc. Unlike other database environments such as MySQL, MSSQL or other SQL derivatives, SQLite is a self-contained, serverless, zero-configuration, transactional SQL database engine that simply relies on one single file to store content. The scope of this article is to give an introduction to SQLite investigations and present an examination scheme simple enough for everyone who needs one (even without any SQL knowledge).

Getting all SQLite databases from an evidence item

Unfortunately, SQLite databases do not share a standard file extension. Instead, every SQLite file has a specific 16-byte header:
[Image: SQLite file header, i.e. the 16-byte ASCII signature “SQLite format 3” followed by a nul byte]

In order to find and extract all SQLite databases from a certain device, a recursive, file-header-based find command can be used to copy all SQLite database files to a temporary directory.

find . -type f | xargs grep "^SQLite" -sl | xargs -I {} cp {} /tmp/mydbs/

If the above line of code does not work due to limited xargs or cp implementations (e.g., on an Android device), try this:

1) find . -type f | xargs grep "^SQLite" -sl > /tmp/mydbs/list_relative.txt   # list matching files (relative paths)
2) sed 's|^\.|/data/data|' /tmp/mydbs/list_relative.txt > /tmp/mydbs/list_absolute.txt   # rewrite only the leading "." to the absolute /data/data prefix
3) ( cat /tmp/mydbs/list_absolute.txt; echo /tmp/mydbs ) | xargs cp   # cp <file list...> /tmp/mydbs
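The same header-based sweep can also be done in Python, which avoids the xargs and sed quirks altogether; source and destination paths below are placeholders.

# Copy every file that starts with the SQLite 3 signature into a collection directory.
import os
import shutil

SQLITE_MAGIC = b"SQLite format 3\x00"   # the 16-byte header of every SQLite 3 database
SRC, DST = "/data/data", "/tmp/mydbs"   # placeholder paths
os.makedirs(DST, exist_ok=True)

for root, _dirs, files in os.walk(SRC):
    for name in files:
        path = os.path.join(root, name)
        try:
            with open(path, "rb") as f:
                if f.read(16) != SQLITE_MAGIC:
                    continue
        except OSError:
            continue   # unreadable file - skip it
        # flatten the directory structure into the file name to avoid collisions
        shutil.copy2(path, os.path.join(DST, root.replace(os.sep, "_") + "_" + name))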

Common sources for valuable SQLite artifacts

Application               | Filename                                            | Tables of Interest                       | Data / Comments
--------------------------|-----------------------------------------------------|------------------------------------------|--------------------------------
Microsoft Skype           | main.db                                             | messages, calls                          | Chat logs
Google History            | History                                             | urls, visits                             | Internet history log
Google Archived History   | Archived_History                                    | urls, visits                             | Internet history log (archive)
Google Cookies            | Cookies                                             | cookies                                  | Cookie dates and timestamps
Google History Index      | History Index YYYY-MM (e.g. History Index 2012-09) | pages_content                            | Indexed webpages
Mozilla History           | places.sqlite                                       | moz_historyvisits, moz_places            | Internet usage protocol
Mozilla Cookies           | cookies.sqlite                                      | moz_cookies                              | Cookie times
Mozilla Downloads         | downloads.sqlite                                    | moz_downloads                            | Download times
iOS Apple Addressbook     | AddressBook.sqlite                                  | ABPerson, ABMultiValueEntry              | Quite complex though
iOS Apple Calendar        | calendar.sqlitedb                                   | event                                    | Event dates
iOS Caller History        | call_history.db                                     | call                                     | Caller log
iOS Apple Notes           | notes.db                                            | note, note_bodies                        | Full notes
iOS Apple SMS             | sms.db                                              | message                                  | Saved SMS conversations
Google Android Webhistory | History, Archived_History                           | urls, visits                             | Chrome web history
Google Android Calendar   | calendar.db                                         | events, reminders                        | Event dates
Google Android Contacts   | contacts2.db, profile.db                            | contacts, calls                          | Call logs, contact information
Google Android DropBox    | db.db                                               | dropbox, camera_upload, pending_uploads  | Info on uploaded files, etc.
Google Android SMS/MMS    | mmssms.db                                           | sms, pending_msgs                        | SMS, MMS w. attachments
Google Android Emails     | EmailProvider[Body].db                              | message, attachment, body                | Attn: body is huge, work selectively

But there are a lot more, to be sure!

Problems to address

The main problem when investigating SQLite databases is diversity. Although SQLite is a well-supported information storage system, the structure of databases may differ even between two versions of the same application. Automatic extraction utilities have to be updated frequently or normalize columns in order to provide overall compatibility. At best the result may lack information; at worst it ends in a software exception error.

In order to provide a reliable investigation, forensic examiners must look inside the database for themselves. But unfortunately not every forensic expert is a thorough SQLite expert as well.

SQLite Basics

Giving a complete tutorial on SQL is not in the scope of this article (no complex SQL statements here). However, there are at least three to five things every forensic investigator should have heard of when doing SQLite forensics. With SQLite installed on your machine (go here if not), there exists a command-line utility to open and work with SQLite databases.

Step 1: How to open a database (here places.sqlite)

> sqlite3 places.sqlite
SQLite version 3.7.12 2012-04-03 19:43:07
Enter ".help" for instructions
Enter SQL statements terminated with a ";"

Step 2: How to list and identify the table(s) of interest

sqlite> .tables
moz_anno_attributes moz_favicons moz_keywords 
moz_annos moz_historyvisits moz_places 
moz_bookmarks moz_inputhistory 
moz_bookmarks_roots moz_items_annos

Step 3: How to list the content of a specific table

sqlite> SELECT * FROM moz_places;
[...]750|place:type=3&sort=4|||0|1|0||0||TRKFw5Df4EZv
751|place:transition=7&sort=4|||0|1|0||0||o_cNoGdIXY50
752|place:type=6&sort=1|||0|1|0||0||OQi-Dp25NDpc
753|place:folder=TOOLBAR|||0|1|0||0||o1NMlcxZr6aU
754|place:folder=BOOKMARKS_MENU|||0|1|0||0||UgJi8dfDmCQc
755|place:folder=UNFILED_BOOKMARKS|||0|1|0||0||bqzv5qULCSKx

Step 4: How to export data to CSV

sqlite> .mode csv
sqlite> .output places.csv
sqlite> SELECT * FROM moz_places;
sqlite> .output stdout

Step 5: How to finally exit the SQLite interpreter

sqlite> .exit

But command line investigation is not everyone’s dream.

SQLite Tools with graphical user interfaces

Software developers and IT professionals also use graphical SQLite utilities to create, examine or delete SQLite content or databases. Common examples are SQLiteBrowser or the Firefox extension SQLiteManager.


left) SQLiteBrowser (SourceForge); right) SQLiteManager (Firefox extension)

Although using GUI software seems easier at first glance, the complexity of SQLite in general, and of these applications in particular, should not be underestimated.

Working read only

Another problem with tools like the ones mentioned above is that they explicitly offer the possibility to manipulate the content, which is not very forensically sound.

Generating human readable output

When investigating SQLite databases, the examiner will encounter many different formats, such as:

  • Different time formats, e.g.
    • Timestamp in seconds since 01.01.1970 00:00:00 UTC (Unix timestamp) very popular
    • Timestamp in milliseconds since 01.01.1970 00:00:00 UTC
    • Timestamp in microseconds since 01.01.1970 00:00:00 UTC (PRTime) e.g. Mozilla Firefox
    • Timestamp in microseconds since 01.01.1601 00:00:00 UTC (Webkit-Time) e.g. Google Chrome
    • Timestamp in seconds since 01.01.2001 00:00:00 UTC (CFAbsoluteTime) typical for Apple
  • Specific flags e.g.
    • 0 = no / 1 = yes
    • odd = out / even = in (e.g. found in iOS sms.db)
  • Software specific types e.g.
    • Mozilla Firefox (-> visit-types, reference)
  • Different formatting stuff, e.g.
    • Line breaks (-> problem when exporting to/from CSV-textfiles)
    • HTML tags (-> unpleasant to read)
  • and many more

Even though IT forensics people should be able to address and interpret all of the above-mentioned formats, investigating personnel might need a more human-friendly, i.e. human-readable, output.
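For reference, the epoch conversions for the timestamp flavours listed above can be written down in a few helper functions; the epoch offsets are the usual published values, so verify them against known data before relying on them in a report.

# Convert the common timestamp flavours above into UTC datetimes.
from datetime import datetime, timedelta, timezone

def from_unix_seconds(ts):        # seconds since 1970-01-01 (Unix timestamp)
    return datetime.fromtimestamp(ts, tz=timezone.utc)

def from_unix_millis(ts):         # milliseconds since 1970-01-01
    return datetime.fromtimestamp(ts / 1000, tz=timezone.utc)

def from_prtime(ts):              # microseconds since 1970-01-01 (Mozilla PRTime)
    return datetime.fromtimestamp(ts / 1_000_000, tz=timezone.utc)

def from_webkit(ts):              # microseconds since 1601-01-01 (Chrome/WebKit time)
    return datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ts)

def from_cfabsolute(ts):          # seconds since 2001-01-01 (Apple CFAbsoluteTime)
    return datetime(2001, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=ts)

print(from_prtime(1349251200000000))   # -> 2012-10-03 08:00:00+00:00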

SQLiteExtractorLE

This is where SQLiteExtractorLE comes into play! The main focus when programming this tool was to provide a simple way of investigating SQLite databases (without any knowledge of SQL) and exporting to the MS Excel file format with direct conversion of the different column formats.

Simple SQLite Browser

At first glance, SQLiteExtractorLE is yet another SQLite browser. You can open SQLite databases and browse the content of tables with one single selection from the drop-down box in (1).

The main viewing panel, and with it the whole investigation workflow, is divided into three parts:

  1. Select table to display
  2. View the table data
  3. Reduce, Convert or Filter column formats

This should be simple enough for everyone. There is no need to learn or write any SQLite statements up to this point.

One more thing: Connecting two tables

In many cases, the data from more than one table has to be combined. SQL people would call this a JOIN.

The example below shows a typical Firefox history database (places.sqlite). The table moz_places only provides information on when a website was visited for the last time. In order to generate a complete Firefox history log, i.e. every time a specific website was actually visited, not only the last time, two tables have to be connected (JOINed).

Connecting two tables is straightforward in SQLiteExtractorLE. After enabling the checkbox below the table-selection drop-down box, the controls for connecting two tables become active. Now:

  1. Select the primary table and the secondary table
  2.  Match the left key with the right key

From the example above: every entry in the Firefox history log (table: moz_historyvisits) corresponds to an entry in the URL repository (table: moz_places); the place_id in moz_historyvisits correlates with the URL-specific id within the table moz_places. It sounds a bit weird, but it is simple enough; just go on and play with it. There is no need to learn or write any SQLite anyway. You can even learn how to do it in pure SQLite by simply pressing the “Edit Query” button (a hand-written equivalent of this JOIN is sketched below). More on Freestyle SQLite later.
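For comparison, here is what that JOIN looks like when written by hand against a copy of places.sqlite, using Python’s built-in sqlite3 module and converting the PRTime visit date on the fly; the file name is a placeholder for your own exported copy.

# A hand-written equivalent of the JOIN described above, for a Firefox places.sqlite copy.
import sqlite3
from datetime import datetime, timezone

con = sqlite3.connect("places.sqlite")   # always work on a copy, never the original evidence
rows = con.execute("""
    SELECT p.url, p.title, v.visit_date
    FROM moz_historyvisits AS v
    JOIN moz_places AS p ON v.place_id = p.id
    ORDER BY v.visit_date
""")
for url, title, visit_date in rows:
    when = datetime.fromtimestamp(visit_date / 1_000_000, tz=timezone.utc)   # PRTime -> UTC
    print(when.isoformat(), url, title or "")
con.close()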

Generating output

You are done! Almost… Before saving the generated output to file, take a minute to refine the data.

Reducing Columns

First off, remove unnecessary columns! Help the investigator focus on the data needed. For example, in the Firefox places.sqlite file, as seen below, there exists a column rev_host in the table moz_places which simply stores the host name in reverse character order. This column has no benefit for the investigation and therefore may be turned off. By deactivating the checkbox to the left of the column name in the column option panel, the column will be removed from the grid view (see below).

Converting Column Data Values

The next step, after removing all unnecessary columns, is converting the technical formats into human-readable text. To do this, the drop-down box below the column name (e.g. visit_date) is used to select the appropriate conversion type (for timestamps in Firefox, usually PRTime). Afterwards, all values in the column (red) will be converted into a human-readable format (green).

Filtering

Sometimes, when taking a first look at the data, it might be helpful to concentrate on keywords. Based on Java regular expressions, the values of a specific column can be filtered. To do this, the keywords have to be typed into the text field below the column name. In the example below, the column moz_places.url is filtered to show only URLs which include youtube or google.


Filtering using regular expressions

Sorting

Sorting should be the last thing prior to exporting, i.e. saving a Microsoft Excel spreadsheet (xls/xlsx). With SQLiteExtractorLE, it is possible to rearrange the columns by dragging the column headers. This can be handy to put the focus on groups of similar columns.

It might also be interesting to sort the values of a certain column (e.g. the date values from old to new or vice versa). To achieve that, a simple click on the column header will do.


Exporting

Exporting is very simple. After successfully saving the file, it is possible to open it right away.


No big surprise… The result comes as a simple but optimized spreadsheet. The column widths are adjusted automatically, the first row is frozen, and line breaks have been enabled.

Advanced Freestyle SQLite Queries

Although many investigations should be possible with the “Standard SQL Investigation”, sometimes it is necessary to use the “Freeform SQL Editor”. The best way to start is to click the “Edit Query” button in the primary Standard view to switch over.


Simply switching from Standard to Freestyle SQL investigation

Within the Freeform SQL Editor, you can code whatever is specified in the standard SQLite syntax. The extent of complexity is limited only by your SQL knowledge.

Context Menu

In order to keep things simple in Freestyle Mode, SQLiteExtractorLE provides a fully featured context menu. It is accessible through a right-click anywhere in the editor view or through CTRL+SPACE on the keyboard.

At the top of the menu, there is a button to clear the editor panel and another four shortcut menu items to insert helpful SQL statements (see below: (1)(2) for generating an investigation overview, (3)(4) for inserting a small or full-featured SELECT statement).

Beneath that, there is a menu item which inserts only the names of the columns enabled in the column option panel. Another item helps to uncomment the selected part of a statement, which is very handy while testing new SQL statements.

Finally, it is possible to load or save queries as templates. These may be stored for later reuse or shared with colleagues. Beware, though, that loading a template replaces all text in the editor.

(1) (NEW) .sqlite-master
SELECT sqlite_master.tbl_name, sqlite_master.sql
FROM sqlite_master
WHERE sqlite_master.sql LIKE '% TABLE %'

(2) (NEW) .schema
SELECT sql FROM 
(SELECT * FROM sqlite_master UNION ALL 
SELECT * FROM sqlite_temp_master) 
WHERE type!='meta' AND sql NOT NULL AND name NOT LIKE 'sqlite_%' 
ORDER BY substr(type,2,1), name

(3) NEW SELECT (mini)
SELECT * FROM table1 JOIN table2 
ON table1.pk = table2.fk 
ORDER BY table1.column DESC

(4) NEW SELECT (big)
SELECT [ALL|DISTINCT] * 
FROM table1 
[JOIN table2 
ON table1.pk == table2.fk] 
[WHERE table1.column1 BETWEEN _begin_ AND _end] 
[ORDER BY table1.column1 [ASC|DESC] [table1.column2 [ASC|DESC]]] 
[GROUP BY table1.column3 
HAVING COUNT(table1.column1) > 1]

Converter Functions and Quick Access for table/column names

In the lower part of the context menu, there are submenus to further simplify creating SQL statements and to reduce typing errors. In order to convert a given column directly when executing the SQL statement, selecting the column name in the editor panel and then choosing a converter menu item will do the job. Almost the same procedure applies to inserting a table or column name: set the cursor to the place where the name should be inserted and choose it from the submenus at the end of the context menu.


left) converter functions; right) table/column names

Summary

SQLiteExtractorLE may be helpful for people who are not familiar with SQL but need to investigate SQLite databases. And there is a lot of valuable data stored in SQLite databases on almost every piece of evidence. It is also a great tool for learning SQL. And last but not least, it is handy for investigating SQLite databases and exporting their content to Excel without having to convert the columns manually in Excel afterwards.

In order to get a copy, simply shoot me an email. Please feel free to contact me for any questions, feature requests etc…

Outlook

Right now I am searching for a generic way to recover deleted database entries. SQLite carving without templates would be a great benefit over commercial tools like Epilog. Until then, have a look at CCL or forensic-from-the-sausage regarding SQLite carving.



Will Digital Forensics Crack SSD’s?

by Mike Sheward, a contributor to InfoSec Resources.

Digital forensics is one of the most interesting and exciting fields of information security that you can ever be fortunate enough to work in, but not for the reasons you might expect. To those who have never been involved in an investigation, sorry to disappoint, it’s nothing like the movies or TV. There are no magical programs that can unravel the world’s strongest encryption algorithms without the need for a key, right before your eyes. Sure, there are some that will have a good go, and often be successful, but they usually require a good dollop of time and as many hints from the investigator as possible. There are, however, a multitude of processes and procedures that you must follow to ensure digital evidence is handled and processed correctly, before it can even be considered digital evidence at all. Then there’s documentation, which is usually followed by additional documentation.

Now this may sound neither interesting nor exciting, but it is, trust me. Personally, I find that the excitement comes from the significant number of challenges that you face during an investigation. One such challenge is making sure you are doing everything you can to look for evidence, while sticking to forensically sound procedures. A good example of this is when you are asked to acquire volatile evidence. You know that you are breaking the golden rule of digital forensics by interacting directly with live evidence rather than a duplicate, but you have to, otherwise additional evidence could slip away and be lost forever.

New and evolving technologies also create new challenges for the investigator. Working with a new file system or even just a new type of file can require a change in approach or the development of a new technique. While these changes may require slight alterations to well defined procedures, it is extremely rare to have to deal with a technology that is a complete “game changer”.

The digital forensic community is currently facing one of these rare situations – the rapidly increasing popularity of solid state drives (SSDs).

Forensics investigators and hard drives have developed something of a mutual understanding over the past couple of decades. That is, if we connect them to a write blocker, they’ll tell us everything they know and there is zero chance of them changing their contents and invalidating our evidence. This understanding has come about because the conventional magnetic hard drives sold today are essentially the same as those sold twenty years ago. Okay, so capacities have increased dramatically, and interfaces between motherboard and disk may have been updated to improve performance, but the fundamental inner workings of the magnetic disk have remained unchanged. Manufacturers build magnetic disks in a largely standard fashion, so no matter which production line it rolled off, it’ll behave in a predictable and repeatable manner, and in digital forensics those are two extremely important qualities.

SSDs are a whole different animal. Like any technology that is new and evolving, manufacturers are still tinkering with the design and implementation. Compared to magnetic drives, there is no standardized approach to producing them. With that, the ancient understanding between forensic investigators and hard drives has been torn to shreds. Suddenly our “bread and butter” predictable evidence source has become a mysterious and secretive device.

So what makes them so different, and why does this affect digital forensics? Well, to answer that question, let’s recap how magnetic drives work and compare them to SSDs. If you’ve never looked into the science of hard drives, it is a fascinating and remarkable process.

Magnetic drives store data by creating extremely small magnetic fields on a thin magnetic coating that is applied to a circular non-magnetic disk, known as a platter. Modern day disks contain multiple platters. The direction of the magnetic field created indicates whether the data stored is a “0” or a “1”. The surfaces are magnetized using a “write head”, which is an extremely thin but highly magnetic piece of wire that floats just above the surface of the platter. To keep data in order, platters are divided into tracks, which are concentric circles that start at the center of the platter and radiate out to the edge. Tracks are further divided into sectors, which are segments or “pie slices”. When the computer knows which track and sector address a piece of data is stored in, it can translate that into a physical location within the disk. A second head, known as a “read head”, will fly over to that location and access the data.


For optimum performance, a magnetic disk will start recording data on the first available sector, and then continue recording on the next closest free sector. This ensures that the read head doesn’t have to jump around all over the place to access an entire file. However, through normal use it is common for chunks of files to become physically displaced across the disk. A cure for this is defragmenting the disk. This process reduces the time required to access a file by moving the fragmented “blocks” of files closer together.

It’s the job of the operating system’s file system to translate logical mappings into the physical locations on the disk we have just discussed. This is why, when you format a drive, its contents appear to vanish. Formatting resets the file system’s logical mappings, but doesn’t actually touch the data in the physical disk locations – which is of course why it is still possible to recover data even after a drive has been formatted.

The file system will declare the sector as unused and ready for new data. In a magnetic disk, if new data is written to an unused sector that physically still holds old data, it’s no big deal. The new data simply overwrites the previous data; and the write head only requires one pass across the platter to perform the operation.

The biggest noticeable difference between an SSD and a magnetic disk is that there are no moving components – hence the “solid state”. This makes them less susceptible to damage caused by shock. It also makes them lighter and means that they drain less power – all valid reasons why they are becoming popular choices for mobile computers.

SSDs are based on the same technology found in USB flash drives. They record data using microscopic transistors, which trap a tiny electric charge. The presence of this charge is what determines whether the transistor represents the familiar “0” or “1”. A fully charged transistor will not allow any more electricity to flow through it; the drive recognizes this and returns a “0”. An uncharged transistor, on the other hand, allows current to flow through it, resulting in a “1”. A freshly erased drive therefore has all of its transistors uncharged. Charge can remain in a transistor for years without additional power being required, meaning that data will remain on the device for just as long. The main benefit of this approach is that the time required to write data is reduced significantly when compared to magnetic drives. Transistors can be charged in microseconds, while those Stone Age magnetic drives take milliseconds to create their magnetic fields.

For all the advantages they bring, SSDs are not without their drawbacks. The methods used by SSD manufacturers to compensate for these drawbacks should be of great concern to the digital forensics community.

Whereas a magnetic disk can theoretically be written to an infinite number of times, an SSD transistor has a comparatively short life expectancy. Typically they can only be written to about 100,000 times before they are likely to fail. So, unlike the magnetic hard disk, which tries to keep blocks of a file as close to each other as possible, an SSD spreads the load randomly across all the unused transistors in the drive. This technique, known as wear-leveling, avoids consistently storing charge in the same group of transistors, which would make them wear out faster. The computer’s operating system is not aware of this process thanks to the SSD’s onboard controller card. The controller presents the operating system with an abstracted list of hard drive sectors. So for example, the OS may think it’s writing a file to sector 15, but it’s actually writing to sector 155. Then, when the OS needs to go back and read sector 15, the controller card will receive that request and know to return the data in sector 155. Later, if the OS overwrites the contents of sector 15, the controller will create a new mapping and write the contents to sector 200 to ensure that sector 155 gets to take a break. You can compare this process to virtual memory address translation, which performs a similar job between applications and RAM.
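
To make the remapping idea more concrete, here is a minimal sketch of the principle: not any vendor's actual algorithm, just a toy controller that keeps a logical-to-physical mapping table and steers every write to a fresh physical sector.

import random

class ToyWearLevelingController:
    """Illustrative only: maps OS-visible logical sectors to physical sectors."""

    def __init__(self, physical_sectors=256):
        self.free = set(range(physical_sectors))  # physical sectors not currently mapped
        self.mapping = {}                         # logical sector -> physical sector
        self.flash = {}                           # physical sector -> data

    def write(self, logical, data):
        # Use a fresh physical sector so the same cells are not worn repeatedly,
        # then retire the previously mapped sector back into the free pool.
        new_phys = random.choice(sorted(self.free))
        self.free.remove(new_phys)
        old_phys = self.mapping.get(logical)
        if old_phys is not None:
            self.free.add(old_phys)               # note: the old data is still physically present
        self.mapping[logical] = new_phys
        self.flash[new_phys] = data

    def read(self, logical):
        return self.flash[self.mapping[logical]]

ssd = ToyWearLevelingController()
ssd.write(15, b"first version")
ssd.write(15, b"second version")
print(ssd.read(15))   # the OS only ever sees "logical sector 15"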

This could have implications for a forensic investigation where evidence is stored on an SSD with a damaged controller card. Faced with the same situation on a magnetic disk, replacing the controller card with one from the same model would allow an investigator to fix the drive and recover the data. However, without knowing the specifics of the wear-leveling mechanisms used by SSD manufacturers, it is impossible to say for certain whether a replacement controller card would know how to translate the correct virtual-to-physical mappings back to the investigator’s machine or imaging device. If this is a truly random process and only the damaged controller card knows how the mappings have been set up, it wouldn’t. The contents of the drive would be presented as a jumbled-up mess, making data recovery an almost impossible task. Worse still, the integrity of the evidence could be called into question, because the image that the investigator acquired would bear no resemblance to the original disk layout.

Another problem faced by SSD manufacturers is that you can’t actually overwrite flash memory, at least not in the conventional sense. Remember, a magnetic hard drive overwrites existing data in just one pass of the write head. SSDs, on the other hand, are forced to “erase” the contents of a transistor before they can write new data to it. It’s like parking your car at the mall the weekend before the holidays; you often have to wait for someone to leave their space before you can put your car in it. Simply driving over the car already in the space would not work. This problem is compounded because an SSD’s storage space is divided into blocks, typically 512 KB in size. If just one byte of data has to be updated, the whole block has to be erased before the updated data can be written. This causes a slowdown in performance, because the SSD has to perform multiple “passes” to overwrite data.

To address this issue, manufacturers are believed to be implementing routines that will preemptively erase old data from blocks that are no longer in use by the computer’s file system. The routines are managed by the SSD’s on-board controller. This is huge, and could represent the single greatest challenge to accepted digital forensics practice to date.

When we attach a magnetic disk to a write blocker, we are absolutely certain that no command that could alter evidence will reach the disk controller, but when the command comes from within, as is the case on an SSD – we have absolutely no control over it.

To explain why this is such a big deal, let’s run through a typical case study. An investigator seizes an SSD from a suspect machine. Having been nervous that he might be about to get caught, the suspect has formatted the drive to cover his tracks. In doing so, the OS marked all the sectors of the drive as unused, including some that still hold incriminating evidence. The investigator takes the seized drive back to the lab and connects it to a write blocker for imaging. Prior to imaging, the investigator produces a cryptographic hash of the source drive. During the imaging phase, the powered-on hard drive starts to perform one of its on-board “clean-up” routines. Its controller erases sectors that contain old data before they can be imaged, and therefore the evidence is lost. When the imaging process is complete, the investigator creates a second cryptographic hash to verify the integrity of the image. When the investigator compares both hashes, there is a difference – caused by the “clean-up” routine removing old data. This makes it impossible for the investigator to confirm the integrity of the evidence and with that, widely accepted forensics best practice is rendered useless.

It’s possible to see this happening by way of a simple experiment. Using WinHex, I wrote to every sector of a 64 GB Samsung SSD, filling the entire drive with “a” characters. This simulates normal data filling up a drive.


I then proceeded to format the drive in Windows; this simulates the suspect formatting the drive to cover their tracks, and should trigger the SSD controller to start erasing the unused space.

After the formatting was complete I powered down the Windows system and connected the SSD to a write blocker. Using FTK Imager, I generated a hash of the drive.


It took about an hour to generate the first hash, so once this had completed I restarted the process and generated a second hash. After another hour, the results are shown below.


This proves that despite the SSD being attached to the write blocker the whole time, the contents of the drive had still changed.
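
The verification logic itself is straightforward. Below is a minimal sketch of the two-pass hash comparison; the device path is a placeholder, and on a real exhibit the device would of course sit behind a write blocker.

import hashlib

def sha256_of_device(path, chunk_size=1024 * 1024):
    """Stream the raw device and return its SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as device:
        for chunk in iter(lambda: device.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

first = sha256_of_device("/dev/sdb")    # placeholder device path
second = sha256_of_device("/dev/sdb")
print("match" if first == second else "MISMATCH - contents changed between passes")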

So how does forensics deal with this? In the short term, with no reliable method of obtaining the same hash twice, SSDs will have to be treated just the same as any other volatile evidence source. Investigators will have to rely on both documentation and demonstration skills to show exactly what steps have been taken while working on the evidence, and hope for an understanding jury. This is less than ideal and can’t go on forever. In the longer term, the onus is surely on the manufacturers of these drives. They will need to open up about, or standardize, the way these clean-up routines are implemented. Perhaps all controller cards should be able to receive a “no erase” command from a write blocker, effectively locking them, though it would be only a matter of time before someone hacked the firmware of a drive and configured the controller to do just the opposite upon receipt of this command. We are just at the beginning of this challenging phase for digital forensics, and it’s a very interesting and exciting place to be.

This post was written by Mike Sheward, a contributor to InfoSec Resources. InfoSec Institute is a provider of high quality information security training.



What are ‘gdocs’? Google Drive Data – part 2

Following up from the recent post on Google Drive, which was designed to give a high-level introduction to the product, this post will delve a bit deeper into the technical issues relating to the data stored and the best approach to accessing it.

The artefacts discussed in this post are based on Windows 7; however, Apple Mac operating systems retain similar data in plists (property lists).

By default data from a user’s Google Drive is stored at C:\Users\USERNAME\Google Drive. In addition to this, there are nuggets of information and data stored on a user’s PC.

If we inspect the following location of the Windows registry we are able to learn a lot more about a particular Google Drive setup on a PC, and we are also able to confirm that the Google Drive product is indeed installed on that PC, by virtue of this key: 
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Installer\UserData\S-1-5-18\Products\227C12A7952F67947BAA66855EDFDEFA\InstallProperties

Within this key we can gather a range of information including when Google Drive was first installed, which is a simple date value in the format YYYYMMDD. In addition to this are version numbers and display names.

As you would expect, there is an entry at HKEY_CURRENT_USER\Software\Google\Drive, but there is little stored here.

Staying within the Windows registry and examining the ‘Run’ key (HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run), we can confirm whether Google Drive is set to run automatically at startup, which is the default.

The first registry entry I spoke of in this post
(HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Installer\UserData\S-1-5-18\Products\227C12A7952F67947BAA66855EDFDEFA\InstallProperties) contains a long string of characters (a GUID) and does not actually mention or refer to Google. In my testing I have found that the GUID 227C12A7952F67947BAA66855EDFDEFA is consistent across all Google Drive installations on Windows, therefore searching for this GUID should identify the location of Google Drive data in the registry.
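
On a live or booted (for example, virtualised) copy of the system, the same checks can be scripted. Below is a minimal sketch using Python's standard winreg module; the key paths follow those above, while the value names queried (and the ‘GoogleDriveSync’ autorun entry) are typical examples rather than guaranteed names.

import winreg

GUID = "227C12A7952F67947BAA66855EDFDEFA"
INSTALL_KEY = (r"SOFTWARE\Microsoft\Windows\CurrentVersion\Installer"
               rf"\UserData\S-1-5-18\Products\{GUID}\InstallProperties")

def read_values(hive, path):
    """Return all value name/data pairs stored under a registry key."""
    values = {}
    with winreg.OpenKey(hive, path) as key:
        _, value_count, _ = winreg.QueryInfoKey(key)
        for i in range(value_count):
            name, data, _ = winreg.EnumValue(key, i)
            values[name] = data
    return values

props = read_values(winreg.HKEY_LOCAL_MACHINE, INSTALL_KEY)
print(props.get("DisplayName"), props.get("DisplayVersion"), props.get("InstallDate"))

run = read_values(winreg.HKEY_CURRENT_USER,
                  r"Software\Microsoft\Windows\CurrentVersion\Run")
print("Autorun entry:", run.get("GoogleDriveSync"))  # value name is an assumption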

Stepping out of the registry, there is a great deal of data to be found within the profile of a user who has Google Drive installed, in addition to the Google Drive files themselves.

If the path C:\Users\USERNAME\AppData\Local\Google\Drive exists, a few SQLite databases and further settings files can be inspected.

First we have a file called ‘pid’. Inside this file is a number, which is the Windows process ID of the Google Drive application. However, the really interesting data is within the SQLite database files here.

The smaller of the two databases is ‘sync_config.db’, which, amongst other information, contains the registered Google Drive account/email address and the location of the Google Drive files – by default C:\Users\USERNAME\Google Drive.

The larger database is ‘snapshot.db’ and contained within it are several tables holding very valuable information. Each file currently stored and not deleted from Google Drive has corresponding entries in the ‘snapshot.db’ database. These entries detail creation and modification dates in unix epoch format (number of seconds elapsed since midnight (UTC) on 1st January 1970).

The entries also include the file names and links to the files within Google Drive’s web store, which, when accessed, require the username and password to be provided. Other database entries include a file type, which is referenced by a number rather than the actual file type, and an MD5 hash value, which I believe Google uses to check for differences in the data during a sync.
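
Below is a minimal sketch of pulling these entries out of a copy of ‘snapshot.db’ and converting the epoch values into readable dates. The table and column names used (‘cloud_entry’, ‘filename’, ‘modified’, ‘checksum’) are assumptions made for illustration; the schema listing at the top should be used to confirm the real names first.

import sqlite3
from datetime import datetime, timezone

con = sqlite3.connect("snapshot.db")

# List the real schema before relying on any table or column names.
print(con.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall())

# Assumed names, for illustration only.
for filename, modified, checksum in con.execute(
        "SELECT filename, modified, checksum FROM cloud_entry"):
    when = datetime.fromtimestamp(modified, tz=timezone.utc)
    print(filename, when.isoformat(), checksum)
con.close()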

If data is deleted from a user’s Google Drive, some interesting things happen. A file deleted from the local Google Drive folder (C:\Users\USERNAME\Google Drive) makes no changes to the ‘snapshot.db’ content until the Google Drive application is running. The entry for the deleted file will therefore remain in the database until the Google Drive application is next enabled and synchronised, at which time the entry is deleted from the database. Even if this has happened, the information is still potentially retrievable – I have had success recovering these deleted entries from unallocated space.

Deletion of data via the Google Drive web interface is different again. Much like a Windows operating system (or Mac and others for that matter), when you delete a file it moves to the ‘Bin’ within the Google web interface and remains there until it is restored or further deleted and permanently removed.

As soon as a file is deleted and moved to the ‘Bin’, and this action is synchronised with the local installation of Google Drive, the entry for that file in the ‘snapshot.db’ is removed from the database.

Switching back to the web interface: even though the deleted file is in the ‘Bin’, users can still work on the file there and can also restore it as a live file. During these actions the revision history is not lost.

Once the file is restored and Google Drive is synchronised with a local Google Drive client again, the entry for that file is added back to the ‘snapshot.db’, complete with the original metadata and, importantly, the original creation date.

It is important to highlight that the metadata stored within the ‘snapshot.db’ is by far the most accurate and reliable.

In contrast, the metadata of the physical files stored in Google Drive accessed by a right-click and properties action is unreliable.

Take the scenario outlined previously, where we deleted a file and restored it: the creation date shown in the Windows file properties will be the date the file was synchronised back from Google Drive. The ‘snapshot.db’, however, shows the true creation date, which is when the file was first created, before the delete and restore actions.

We know that the presence of Google Drive has the potential to assist our eDiscovery and forensic work, provided there is a solid understanding of how it operates and how the data can be captured and interrogated.

What are the practicalities and considerations when working with such data? In my first post I highlighted the need to understand Google Drive data and that native Google file types are little more than placeholders to file content stored within Google Drive servers.

What one also needs to appreciate is the importance of the structured databases generated by Google Drive. It is from this structured data that we can recover a host of information about the files within Google Drive – the true metadata, if you will.

In terms of approach – because Google Drive can be accessed via the Internet, it is essential that in forensic and eDiscovery matters one considers both securing and isolating access to such data immediately. This should include removing network connectivity from computers and/or mobile devices to disable further synchronisation of the data. If this were not done and an individual deleted data from the Google web interface, those changes would propagate to the computers and/or mobile devices.

One should also revisit the point that an individual can, in theory, permanently delete data from Google Drive via the web interface. As a result, that data must be secured quickly and, where possible, a legal hold put in place to prevent such data being deleted. You cannot afford to wait or ignore this issue and should try whenever possible to collect data from Google Drive immediately (unless it is not considered to be within scope).

The presence of Google Drive placeholders on computers and/or mobile devices means there is further work to do in terms of data capture, and this work must be performed via the web interface. The Google Drive web interface allows users to download any file or a collection of files locally in a variety of different formats.

One should endeavour to download such data in as close to native format as possible, for example downloading a gdoc file as a docx file.

There are several issues to consider when downloading data directly from Google Drive’s web interface, including:

  1. the question of gaining access with the username and password;
  2. jurisdictional considerations;
  3. the potential for loss of metadata.

As a consequence, best practice will be to capture and preserve both the Google Drive files AND the Google Drive structured databases, so as to give as full a picture as possible.

Make sure you keep an eye on the Millnet blog for further updates.



Interpretation of NTFS Timestamps

Introduction

File and directory timestamps are one of the resources forensic analysts use for determining when something happened, or in what particular order a sequence of events took place. As these timestamps usually are stored in some internal format, additional software is needed to interpret them and translate them into a format an analyst can easily understand. If there are any errors in this step, the result will clearly be less reliable than expected.

My primary purpose in this article is to present a simple design of test data suitable for determining whether there are errors or problems in how a particular tool performs these operations. I will also present some test results from applying the tests to different tools.

For the moment, I am concerned only with NTFS file timestamps. NTFS is probably the most common source of timestamps that an analyst will have to deal with, so it is important to ensure that timestamp translation is correct.  Similar tests need to be created and performed for other timestamp formats.

Also, I am ignoring time zone adjustments and daylight saving time: the translation to be examined will cover Coordinated Universal Time (UTC) only.

Background Information

An NTFS file timestamp, according to the documentation of the ‘FILETIME’ data structure in the Windows Software Development Kit, is a “64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (UTC)”.

Conversion from this internal format to a format more suitable for human interpretation is performed by the Windows system call FileTimeToSystemTime(), which extracts the year, month, day, hour, minutes, seconds and milliseconds from the timestamp data. On other platforms (e.g. Unix), or in software that is intentionally platform-independent (e.g. Perl or Java), other methods of translation are required.

The documentation of FileTimeToSystemTime(), as well as practical tests, indicates that the FILETIME value to be translated must be 0x7FFFFFFFFFFFFFFF or less. This corresponds to the time 30828-09-14 02:48:05.4775807.
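
For reference, the UTC-only translation discussed here is easy to reproduce outside Windows. The sketch below is a minimal Python equivalent; note that Python's own datetime type stops at year 9999, long before the FILETIME maximum.

from datetime import datetime, timedelta

FILETIME_EPOCH = datetime(1601, 1, 1)   # FILETIME epoch, UTC
FILETIME_MAX = 0x7FFFFFFFFFFFFFFF

def filetime_to_utc(filetime: int) -> str:
    """Translate a 64-bit FILETIME (100-ns ticks since 1601-01-01 UTC) to text."""
    if not 0 <= filetime <= FILETIME_MAX:
        return "<out of range>"
    seconds, ticks = divmod(filetime, 10_000_000)
    try:
        when = FILETIME_EPOCH + timedelta(seconds=seconds)
    except OverflowError:
        # Python's datetime stops at year 9999; FILETIME reaches 30828.
        return "<beyond year 9999>"
    return f"{when.isoformat(sep=' ')}.{ticks:07d}"

print(filetime_to_utc(0))                   # 1601-01-01 00:00:00.0000000
print(filetime_to_utc(0x7FFFFFFFFFFFFFFF))  # <beyond year 9999>
print(filetime_to_utc(0x8000000000000000))  # <out of range>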

File timestamps are usually determined by the system clock at the time some file activity was performed. It is, though, also possible to set file timestamps to arbitrary values. On Vista and later, the system call SetFileInformationByHandle() can be used; on earlier versions of Windows, NtSetInformationFile() may be used. No special user privileges are required.

These system calls have a similar limitation in that only timestamps less than or equal to 0x7FFFFFFFFFFFFFFF will be set. Additionally, the two timestamp values 0x0 and 0xFFFFFFFFFFFFFFFF are reserved to modify the operation of the system call in different ways.

The reverse function, SystemTimeToFileTime(), performs the opposite conversion: translating a time expressed as the year, month, day, hours, minutes, seconds, etc into the 64-bit file time stamp. In this case, however, the span of time is restricted to years less than or equal to 30827.

Requirements

 Before any serious testing is done, some kind of baseline requirements need to be established.

  1. Tests will be performed mainly by humans, not by computers. The number of test points in each case must not be so large as to overwhelm the tester; a maximum limit of around 100 test points seems reasonable. Tests designed to be scored by computer would allow for more comprehensive testing, but would also need to be specially adapted to each tool being tested.
  2. The currently known time range (0x0 to 0x7FFFFFFFFFFFFFFF) should be supported. If the translation method does not cover the entire range, it should report out-of-range times clearly and unambiguously. That is, there must be no risk of misinterpretation, either by the analyst or by readers of any tool-produced reports. A total absence of translation is not quite acceptable on its own — it requires special information or training to interpret, and the risk of misinterpretation appears fairly high. A single ‘?’ is better, but if there are multiple reasons why a ‘?’ may be used, additional details should be provided.
  3. The translation of a timestamp must be accurate, within the limits of the chosen representation. We don’t want a timestamp translated into a string to become a very different time when translated back again. The largest difference we can tolerate is related to the precision of the display format: if the translation doesn’t report time to a greater precision than a second, the tolerable error is half a second (assuming rounding to the nearest second) or up to one second (assuming truncation). If the precision is milliseconds, the tolerable error is of the corresponding order.

TEST DESIGN 

Test 1: Coverage

 The first test is a simple coverage test: what period of time is covered by the translation? The baseline is taken to be the full period covered by the system call FileTimeToSystemTime(), i.e. from 1601-01-01 up to 30828-09-14.

The first subtest checks the coverage over the entire baseline. In order to do that, and also keep the number of point tests reasonably small, each millennium is represented by a file, named after the first year of the period, the timestamps of which are set to the extreme timestamps within that millennium. For example, the period 2000-2999 is tested (very roughly, admittedly) by a single file, called ‘02000’, with timestamps representing 2000-01-01 00:00:00.0000000 and 2999-12-31 23:59:59.9999999 as the two extreme values (Tmin and Tmax for the period being tested).
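
For anyone reproducing the test data, the Tmin/Tmax values themselves are easy to compute with the reverse translation. A minimal sketch (computing, not setting, the FILETIME values) for the ‘02000’ test file could look like this:

from datetime import datetime

FILETIME_EPOCH = datetime(1601, 1, 1)

def utc_to_filetime(when: datetime) -> int:
    """Translate a UTC datetime into 100-ns FILETIME ticks since 1601-01-01."""
    delta = when - FILETIME_EPOCH
    return (delta.days * 864_000_000_000
            + delta.seconds * 10_000_000
            + delta.microseconds * 10)

# Tmin/Tmax for the period 2000-2999; the final 100-ns digit of Tmax cannot be
# expressed through datetime (microsecond precision), so it is added separately.
tmin = utc_to_filetime(datetime(2000, 1, 1))
tmax = utc_to_filetime(datetime(2999, 12, 31, 23, 59, 59, 999999)) + 9
print(hex(tmin), hex(tmax))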

The second subtest makes the same type of test, only it checks each separate century in the period 1600 — 8000. (There is no particular reason for choosing 8000 as the ending year.)

The third subtest makes the same type of test, only it checks each separate year in the period 1601 — 2399. In these tests, Tmin and Tmax are the starting and ending times of each single year.

The fourth subtest examines the behaviour of the translation function at some selected cut-off points in greater detail.

These tests could easily be extended to cover the entire baseline time period, but this makes them less suitable for manual inspection: the number of points to be checked will become unmanageable for ‘manual’ testing.

Test 2: Leap Years

The translation must take leap days into account. This is a small test, though not unimportant.

The tests involve checking the 14-day period ‘around’ February 28th/29th for the presence of a leap day, as well as for discontinuities.

Two leap year tests are provided: ‘simple’ leap years (2004 – year evenly divisible by 4), and ‘exceptional’ leap years (2000 – year evenly divisible by 400).

Four non-leap tests are provided: three for ‘normal’ non-leap years (2001, 2002, 2003) and one for an ‘exceptional’ non-leap year (1900 — year divisible by 100 but not by 400).

More extensive tests can easily be created, but again the number of required tests would surpass the limit of about 100 specified in the requirements.

It is not entirely clear whether leap days always are/were inserted after February 28th in the UTC calendar: if they are/were inserted after February 23rd, additional tests may be required for the case where the timestamp translation includes the day of the week. Alternatively, such tests should only be performed in time zones for which this information is known.

Test 3: Rounding

This group of tests examines how the translation software handles limited precision. For example, assume that we have a timestamp corresponding to the time 00:00:00.6, and that it is translated into textual form that does not provide sub-second precision.  How is the .6 second handled?  Is it chopped off (truncated), producing a time of ’00:00:00′?  Or is it rounded upwards to the nearest second: ’00:00:01′?

In the extreme case, the translated string may end up in another year (or even millennium) than the original timestamp. Consider the timestamp 1999-12-31 23:59:59.6: will the translation say ‘1999-12-31 23:59:59’ or will it say ‘2000-01-01 00:00:00’? This is not an error in and of itself, but an analyst who does not expect this behaviour may be confused by it. If he works to an instruction to ‘look for files modified up to the end of the year’, there is a small probability that files modified at the very turn of the year may be omitted because they are presented as belonging to the following year. Whether or not that is a real problem will depend on the actual investigation, and on if and how such time limit effects are handled by the analyst.

These tests are split into four subgroups, testing rounding to minutes, seconds, milliseconds and microseconds, respectively. For each group, two directories corresponding to the main unit are created, one for an even unit, the other for an odd unit (the ‘rounding to minutes’ test uses 2001-01-01 00:00 and 00:01). In each of these directories files are created for the full range of the test (0-60, in the case of minutes), and timestamped according to the Tmin/Tmax convention already mentioned.

If the translation rounds upwards, or rounds to the nearest even or odd unit, this will be possible to identify from this test data. More complex rounding schemes may not be possible to identify.

Test 4: Sorting

These tests are somewhat related to the rounding test, in that the test examines how the limited precision of a timestamp translation affects sorting a number of timestamps into ascending order.

For example, a translation scheme that includes minutes but not seconds, and that sorts events by the translated string only, will not necessarily produce a sorted order that follows the actual sequence of events.

Take the two file timestamps 00:00:01 (FILE1) and 00:00:31 (FILE2). If the translation truncates timestamps to minutes, both times will be shown as ‘00:00’. If they are then sorted into ascending order by that string, the analyst cannot decide whether FILE1 was timestamped before FILE2 or vice versa. And if such a sorted list appears in a report, a reader may draw the wrong conclusions from it.
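
The effect is easy to demonstrate. In the sketch below (file names and times taken from the example above, listed in directory order FILE2, FILE1), sorting by a minute-precision string merely preserves the input order, while sorting by the underlying timestamp value recovers the real sequence:

from datetime import datetime

files = {
    "FILE2": datetime(2013, 1, 1, 0, 0, 31),
    "FILE1": datetime(2013, 1, 1, 0, 0, 1),
}

# Minute-precision strings are identical ("00:00"), so a stable sort simply
# keeps the input order and carries no information about the event sequence.
by_string = sorted(files, key=lambda name: files[name].strftime("%H:%M"))

# Sorting by the underlying timestamp value recovers the true order.
by_value = sorted(files, key=lambda name: files[name])

print(by_string)  # ['FILE2', 'FILE1']  -- potentially misleading
print(by_value)   # ['FILE1', 'FILE2']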

The tests are subdivided into sorting by seconds, milliseconds, microseconds and nanoseconds respectively. Each subtest provides 60, 100 or 10 files with timestamps arranged in four different sorting orders. The names of these files have been arranged in an additional order to avoid the situation where files already sorted by file name are not rearranged by a sorting operation. Finally, the files are created in random order.

The files are named on the following pattern: <nn>_C<nn>_A<nn>_W<nn>_M<nn>, e.g. ’01_C02_A07_W01_M66′.

Each letter indicates a timestamp field (C = created, A = last accessed, W = last written, M = last modified), with <nn> indicating the particular position in the sorted sequence that timestamp is expected to appear in. The initial <nn> adds a fifth sorting order (by name), which allows the tester to ‘reset’ to a sorting order that is not related to timestamps.

Each timestamp differs only in the corresponding subunit: the files in the ‘sort by seconds’ set have timestamps that are identical except for the seconds part, and the ‘sort by nanoseconds’ files differ only in the nanosecond information. (As the timestamp only accommodates 10 separate sub-microsecond values, only 10 files are provided for this test.)

The test consists of sorting each set of files by each of the timestamp fields: if sorting is done by the particular subunit (second, millisecond, etc.), the corresponding part of the file name will appear in sorted order. Thus, an attempt to sort by creation time in ascending order should produce a sequence in which the C-sequence in the file name also appears in order: C00, C01, C02, … etc, and no other sequence should appear in the same ascending order.

An implementation with limited precision in the translated string, but that sorts according to the timestamp values will sort perfectly also when sorting by nanoseconds is tested.  If the sort is by the translated string, sorting will be perfect up to that smallest unit (typically seconds), and further attempts to sort by smaller units (milliseconds or microseconds) will not produce a correct order.

If an implementation that sorts by translated string also rounds timestamps, this will have additional effects on the sorting order.

Test 5: Special tests

In this part, additional timestamps are provided for testing. Some of these cannot be created by the documented system calls, and need to be created by other methods.

0x00FFFFFFFFFFFFFF
0x01FFFFFFFFFFFFFF
0x03FFFFFFFFFFFFFF

0x7FFFFFFFFFFFFFFF

These timestamps can be set by the system calls, and may not have been covered by the other tests.

0x0000000000000000

This timestamp should translate to 1601-01-01 00:00:00.0000000, but it cannot be set by any of the system calls tested.

0x8000000000000000
0xFFFFFFFE00000000
0xFFFFFFFF00000000
0xFFFFFFFFFFFFFFFE
0xFFFFFFFFFFFFFFFF

These timestamps cannot be set by system call, and need to be edited by hand prior to testing.

These values test how the translation mechanism copes with timestamps that produce error messages from the FileTimeToSystemTime() call.

Other tests

TZ & DST — Time zone and daylight saving time adjustments are closely related to timestamp translation, but are notionally performed as a second step, once the UTC translation is finished. For that reason, no such tests are included here: until it is reasonably clear that UTC translation is done correctly, there seems little point in testing additional adjustments.

Leap seconds — The NTFS timestamp convention is based on UTC, but ignores leap seconds, which are included in UTC. For a very strict test that the translation mechanism does not take leap seconds into account, additional tests are required, probably on the same pattern as the tests for leap years, but at a resolution of seconds.

However, if leap seconds have been included in the translation mechanism, it should be visible in the coverage tests, where the dates from 1972 onwards would gradually drift out of synchronization (at the time of writing, 2013, the difference would be 25 seconds).

Day of week — No tests of day-of-week translation are included.

Additional Notes

A Windows program that creates an NTFS structure corresponding to the tests described has been written, and used to create an NTFS image. The Special tests directory in this image has been manually altered to contain the timestamps discussed. Both the source code and the image file are (or will very shortly be) available from SourceForge as part of the ‘CompForTest’ project.

It must be stressed that the tests described should not be used to ‘prove’ that some particular timestamp translation works as it should: all the test results can be used for is to show that it doesn’t work as expected.

TEST RESULTS

As the test image was being developed, different tools for examination of NTFS timestamps were tried out. Some of the results (such as incomplete coverage) were used to create additional tests.

Below, some of the more interesting test results are described.

It should be noted that there may be additional problems that affect the testing process. In one tool test (not included here), it was discovered that the tool occasionally did not report the last few files written to a directory. If this kind of problem is present in other tools as well, test results may be incomplete.

Notes on rounding and sorting have been added only if rounding has been detected, or if sorting is done by a different resolution than the translated timestamp.

Autopsy 3.0.4:

Timestamp range:

1970-01-01 00:00:01 –  2106-02-07 06:28:00
1970-01-01 00:00:00.0000000 is translated as ’0000-00-00 00:00:00′

Timestamps outside the specified range are translated as if they were inside the range (e.g. timestamps for some periods in 1673, 1809, 1945, 2149, 2285, etc. are translated as times in 2013). This makes it difficult for an analyst to rely only on this version of Autopsy for accurate time translation.

In the screen dump below, note that the 1965-1969 timestamps are translated as if they were from 2032-2036.


EnCase Forensic 6.19.6:

Timestamp range:

1970-01-01 13:00 — 2038-01-19 03:14:06
1970-01-01 00:00 — 12:00 are translated as blank (empty). The period 12:00 — 13:00 has not been investigated further.

Remaining timestamps outside the specified ranges are also translated as blank (empty).

The screen dump below shows the hours view of the cut-off date 1970-01-01 00:00. The file names indicate the offset from the baseline timestamps, HH+12 indicating an offset of +12 hours from 00:00. It is clear that from HH+13 onwards, translation appears to work as expected, but for the first 13 hours (00 — 12) no translation is provided, at least not for these test points.


ProDiscover Basic 6.5.0.0:

Timestamp range:

1970-01-02 — 2038, 2107 — 2174, 2242 — 2310, 2378 — 2399 (all ranges examined)

Timestamps prior to 1970-01-02, and some time after 3000, are uniformly translated as 1970-01-01 00:00, making it impossible to determine the actual time for these ranges.

Timestamps after 2038, and outside the stated ranges, are translated as ‘(unknown)’.

Translation truncates to minutes.

The following screen dump shows both the uniform translation of early timestamps as 1970-01-01, as well as the ‘(unknown)’ and the reappearance of translation in the 2300-period. (The directories have also been timestamped with the minimum and maximum times of the files placed in them.)


WinHex 16.6 SR-4:

Timestamp range:

1601-01-01 00:00:01 — 2286-01-09 23:30:11.
1601-01-01 00:00:00.0000000 and .00000001 are translated as blank.

Timestamps after 2286-01-09 23:30:11 are translated partly as ‘?’, partly as times in the specified range, the latter indicated in red. The cut-off time 30828-09-14 02:48:05 is translated as blank.


Additional Tests

Two additional tests on tools not intended primarily for forensic analysis were also performed: the Windows Explorer GUI and the PowerShell command line. Neither of these provides for additional time zone adjustment: their use will be governed by the current time configuration of the operating system. In the tests below, the computer was reset to the UTC time zone prior to testing.

 PowerShell

Timestamp range:

1601-01-01 00:00:00 –  9999-12-31 23:59:59

 Timestamps outside the range are translated as blank.

Sorting is by timestamp binary value.

The command line used for these examinations was:

 Get-ChildItem path | Select-Object name,creationtime,lastwritetime

for each directory that was examined. Sorting was tested by using

 Get-ChildItem path | Select-Object name,creationtime,lastwritetime,lastaccesstime | Sort timefield

The image below shows sorting by LastWriteTime and nanoseconds (or more exactly tenths of microseconds). Note that the Wnn specifications in the file names appear in the correct ascending order:


Windows Explorer GUI:

Timestamp range:

1980-01-01 00:00:00 — 2107-12-31 23:59:57
2107-12-31 23:59:58 and :59 are shown as blank

Remaining timestamps outside the range are translated as blank.

It must be noted that the timestamp range only refers to the times shown in the GUI list. When the timestamp of an individual file is examined in the file property dialog (see below), the coverage appears to be the full range of years.

Additionally, the translation on at least one system appears to be off by a few seconds, as the end of the time range shows. Additional testing is required to say if this happens also on other Windows platforms.


However, when the file ’119 – SS+59′ is examined by the Properties dialog, the translation is as expected. (A little too late for correction I see that the date format here is in Swedish — I hope it’s clear anyway.)


Interpretation of results

In terms of coverage, none of the tools presented above is perfect: all are affected by some kind of restriction on the time period they translate correctly. The tools that come off best are, in order of the time range they support:

PowerShell 1.0  (1601–9999)
Windows Explorer GUI (1980–2107)
EnCase 6.19 (1970–2038)

 Each of these restricts translations to a subset of the full range, and shows remaining timestamps as blank.  PowerShell additionally sorts by the full binary timestamp value, rather than the time string actually shown.

The Windows Explorer GUI also appears to suffer from a two-second error: the last second of a minute, as well as parts of the immediately preceding second, are translated as belonging to the following minute. This affects the result, but as this is not a forensic tool it has been discounted.

The tools that come off worst are:

Autopsy 3.0.4
ProDiscover Basic 6.5.0.0
WinHex 16.6 SR-4

Each of these shows unacceptably large errors between all or some file timestamps and their translations. ProDiscover comes off only slightly better in that timestamps up to 1970 are all translated as 1970-01-01, and so can be identified as suspicious, but at the other end of the spectrum the translation error is still approximately the same as for Autopsy: translations are more than 25,000 years out of register. WinHex suffers from similar problems: while it flags several ranges of timestamps as ‘?’, it still translates many timestamps totally wrong.

It should be noted that there are later releases of both Autopsy and ProDiscover Basic that have not been tested.

It should probably also be noted that additional tools have been tested, but that the results are not ‘more interesting’ than those presented here.

How to live with a non-perfect tool?

  1. Identify if and to what extent some particular forensic tool suffers from the limitations described above: does it have any documented or otherwise discoverable restrictions on the time period it can translate, and does it indicate out-of-range timestamps clearly and unambiguously, or does it translate more than one timestamp into the same date/time string?
  2. Evaluate to what extent any shortcomings can affect the result of an investigation, in general as well as in particular, and also to what extent already existing lab practices mitigate such problems.
  3. Devise and implement additional safeguards or mitigating actions in the case where investigations are significantly affected.

These steps could also be important to document in investigation reports.

In daily practice, the range of timestamps is likely to fall within the 1970–2038 range that most tools cover correctly — the remaining problem would be if any outside timestamps appeared in the material, and the extent to which they are recognized as such and handled correctly by the analyst.

The traditional advice, “always use two different tools” turns out to be less than useful here, unless we know the strengths and weaknesses of each of the tools.  If they happen to share the same timestamp range, we may not get significantly more trustworthy information from using both than we get from using only one.

A. Thulin
(anders@thulin.name)



Categorization of embedded system forensic collection methodologies

There are many classifications as far as forensic data collection is concerned, but much of it is still de facto, a Wild West when it comes to naming conventions. This is especially true in the embedded-system area.

When I refer to embedded systems, I think of specialized devices, sometimes part of a larger system or machine. Embedded systems usually have at least one microprocessor with a dedicated program, and limited options for extracting information in a forensically sound way. Cell phones, smart phones, tablets, DVD and BluRay players, advanced digital watches, TVs, cars, elevators, and even washers & dryers can have embedded systems.

I would like to suggest a more structured way to represent data collection methods for such systems. As this is a work in progress, I look forward to constructive criticism that can benefit the forensics community.

The classification is broken down into six methodologies.

  1. Manual acquisition
  2. Logical acquisition
  3. Pseudo-physical acquisition
  4. Support-port acquisition
  5. Circuit read acquisition
  6. Gate read acquisition

Each methodology has its shortcomings and benefits. I categorized these into four areas, and ranked them on a scale of 1 to 10, with 1 for “least” and 10 for “most”.

Destructiveness is the impact on the target device, and how likely it is that everything remains fully functional after data collection.

Technical & Training is the level of understanding and education required to attempt the methodology.

Cost is simply the expenses involved with the resources required, such as equipment, tools and consumables, to attempt the methodology.

Forensically Sound, the final measurement, is how likely it is that the original data is modified, knowingly or not.

Acquisition Methodology Comparison

Manual

This is the oldest methodology, and the one requiring the least training and equipment. The examiner takes advantage of the device’s display and user interface, using a camera to record as much relevant information as possible. The target device may record all display and user interface activity, and update system data as part of normal housekeeping.

Example: Secure the cell phone in a holding bracket, then use the keypad to scroll through all relevant items while taking pictures of the cell phone with an external camera. A commercial product used for this kind of acquisition is Paraben Project-A-Phone.

Logical

In the logical acquisition method, the device’s operating system (OS) is in full control of what can be accessed, and provides the means to transfer the data. The examiner connects the device to a forensic workstation and, using various software packages, communicates with the OS on the target device. The OS may record the connection and communication on the target device, and update system data as part of normal housekeeping.

Example: Connect the cell phone’s external port to a USB port using a proprietary cable. Run software to initiate serial communication with the device, and request information from the device using proprietary, device-specific commands. Software such as BitPim would be used for this type of acquisition.

Pseudo-Physical

The process of pseudo-physical collection involves forcing program code onto the target device in some way that allows access to most data areas. The code may merely provide access, taking advantage of the target device’s OS for communication, or it may be a complete replacement of the OS with just collection functionality. Thereafter, the examiner connects the device to a forensic workstation and, using various software packages, communicates with the program code or the OS on the target device. The OS may record the connection and communication on the target device, and update system data as part of normal housekeeping. The forced-on code may also impact the information on the target device.

Although often touted as physical acquisition by almost all vendors, this process is not, in my opinion, truly physical as most forensics examiners expect it to be. Most forensics examiners think “bit-by-bit” when they hear “physical”. In my experience, this is not the case, as unallocated and slack areas of the storage are not collected.

Example: Target device is connected to the forensic workstation with a USB to proprietary serial cable. The target device is placed in Device Firmware Update (UDF) mode. The software on the forensic workstation at this time may load a special program code onto the target device. The code allows the software on the forensic workstation to access most information on the target device. Sometimes the target device’s UDF mode software provides the communication features.

Support-Port

Most mass-produced electronic devices have ports for testing the electronics, or for updating firmware on various onboard integrated circuitry. These “ports” can be implemented as user-accessible ports such as USB, RS232 or even some pin-and-socket connector (Molex), non-user-accessible ports including pin headers or insulation-displacement connectors, and finally test connection pads that appear on the printed circuit assembly (PCA).

To access these ports, almost all small electronics require disassembly, often voiding the manufacturer’s warranty. Once the device is disassembled, the port must be identified on the PCA, and the specific communication protocol must also be found. Communication is established with the specific storage circuitry, and data is requested. This data is then stored for further analysis.

The most often used protocols are Boundary Scan (often referred to by the standardizing group name Joint Test Action Group [JTAG]), Inter-Integrated Circuit (I2C), Serial Peripheral Interface (SPI), Enhanced Synchronous Serial Interface (ESSI), Controller Area Network (CAN), Local Interconnect Network (LIN), and Background Debug Mode (BDM).

Example: The target device is disassembled, and test access points (TAP) are located. Leads are soldered or clamped onto the TAP, and connected to a protocol specific universal asynchronous receiver/transmitter (UART). This device in turn is connected to a USB port of the forensic workstation. Specialized software using circuit-specific commands instructs the on-board device to download data from the circuit. The returned data is stored on the forensic workstation. No information is stored or written to the target device besides the temporary instructions.

Circuit Read

For this acquisition methodology, the integrated circuits (ICs) such as memory chips are desoldered from the PCA and data is extracted using chip-specific pin-outs and communication. This is often referred to as the “chip-off” process.

There are several critical points with this method, including the possibility of permanently damaging the IC during desoldering, and the difficulty of dealing with stacked ICs (3D packaging) or monolithic configurations.

In this particular method, the IC is removed, socketed or soldered, and specific signals are sent to extract the data from the specific chip, using specialized software.

Example: The target device is disassembled, and data storage ICs are located. Pin-out information and timing details for communication with the IC are researched. The target device is preheated, and then the specific ICs are desoldered. The ICs are either placed in temporary sockets, or leads are soldered to the appropriate pins. The socket or leads are connected to a communication device using the proper communication protocol, such as Transistor-Transistor Logic (TTL), which in turn is connected to the forensics workstation.

Specialized software using IC-specific commands instructs the socket to download data from the IC. The returned data is stored on the forensic workstation. No information is stored or written to the target IC.

Gate Read

This methodology requires both equipment and chemicals that are usually not found in most digital forensics labs. The process involves the removal of the target IC in similar fashion to the Circuit Read acquisition methodology. Instead of attempting to communicate with the IC through electronic signals, the chip is literally sliced into multiple layers to expose each original semiconductor lithographic layer, and information is reverse engineered from the layers.

The layers are measured in nanometers (1 x 10^-9 m), or a billionth of a meter. Each layer is removed, photographed, and then reverse engineered from the photograph. The process is as much guesswork as it is a very high-level understanding of IC internals and IC lithography. The process works best with planarized chips. The steps of the process are device depotting or package removal, delayering, imaging, annotation, schematic, organization and finally analysis.

Example: The target device is disassembled, and data storage ICs are located. Pin-out information for the IC is researched. The target device is preheated, and then the specific ICs are desoldered. The IC is bathed in chemicals to remove the potting, or encasing. At this point, the only remaining items are the leads attached to a piece of silicon die. The leads are noted and photographed. Using lapping (or another very precise slicing or abrasion method), each layer of the die is removed and photographed. The layers are stacked in software and reverse engineered using the shape, color density and interconnection of the layers. This process requires identifying, amongst other things, the N-type and P-type silicon, the gates, power and ground.

Reference:

                       Manual   Logical   Pseudo-Physical   Support-Port   Circuit Read   Gate Read
Destructiveness          1         1             2                3              5            10
Technical & Training     1         2             3                5              6             9
Cost                     1         2             3                3              5             7
Forensically Sound       1         2             5                8              9             7

Rankings are on a scale of 1 to 10, with 1 for “least” and 10 for “most”. For example, the most destructive methodology would be a 10; the least costly would be a 1.



Mobile Device Geotags & Armed Forces

In recent years it has been noticeable that the number of people carrying a smart phone has increased exponentially. This is down to their low price and availability; even children as young as 12 have a smart phone. However, most people who own a smart phone are not aware of the data hidden in even the simplest and most innocent things they do on their phones. This includes armed forces staff. This article will look at the issues and possible repercussions of the availability of such easily obtained data.

Let's consider a scenario: an armed forces staff member is on patrol. They take a picture of themselves and upload it to a social media site. Their personal profile on this site is not secured, or has privacy settings loose enough that anyone can view their photos. A militant group happens to be doing some research on their "enemy". They use an advanced Google search, hit on the right combination of words or phrases, and find this picture. What could possibly happen?

First off, the basics:

What is a geotag?

Geotagging is the addition of geographical data to the metadata of an object; in this case, a picture taken by armed services personnel.

A geotag on a photograph from an iPhone, for example, captures the GPS coordinates (latitude and longitude) of the location where it was taken.

Obtaining geotag information

Using free tools that are widely available on the internet, it can take seconds to reveal the geotag information. It requires very little effort and absolutely no training, which makes it ideal for militant groups who want to find this information quickly.

Below is an example. For this demonstration I will be using a picture of the blue ball from a snooker set, but imagine this photo was a team photo taken in a base on foreign soil.

Here I’m using Evigator’s TAGView software

(available @ http://www.evigator.com/)

1 – Locate the image and open it using the Open Image Icon.


2 – Press Open


3 – The Image will be analysed and you will have a screen similar to below:


4 – Sample data from the analysed picture.


As you can see from the above, the geotag data and various information about the device the picture was taken on are highlighted. Also note the mapped location of where it was taken. Obtaining this information took less than three seconds once the image was loaded into the program.
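The same fields can also be pulled from the command line with the free, open-source exiftool utility. Below is a minimal, hedged sketch; the file name is purely illustrative:

# Hypothetical example using exiftool; "patrol_photo.jpg" is an illustrative file name.
exiftool -n -GPSLatitude -GPSLongitude -GPSPosition patrol_photo.jpg
# -n prints decimal coordinates instead of degrees/minutes/seconds,
# ready to paste straight into an online mapping service.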

Security Risks & Repercussions

So what are the security risks? Well, as already pointed out, the information could reveal any number of things: barracks, bases, patrol points or even patrol patterns. This information not only puts the staff member who uploads the pictures in danger but also their entire deployment group.

Potential death is not the only issue. With profiles left insecure, a single member could be profiled by the militant group, leading to potential blackmail, kidnap or the endangerment of family members.

What should the armed forces be doing?

There are many things the armed forces could be doing. The key thing is to offer the training necessary to remind their staff of the issues of geotags and smartphones. They could ban personal phones completely; however, some servicemen and women would still find a way to take them into active duty.

A one hour basic training session that shows the dangers is all that is needed. The session could cover basic security settings of their social networking profiles and turning off the location services on any of their devices.

A one hour session could be the difference between life and death in most cases during deployment.

This article has been geared towards the idea of militant groups; however, it is not just militant groups. It could be anyone: stalkers, thieves, even an enraged ex could use these techniques.



KS – an open source bash script for indexing data


ABSTRACT: KS is a keyword-searching tool that works on allocated data, unallocated data and slack space, using indexing software and database storage.

Often during a computer forensics analysis we need all the keywords indexed in a database so that we can run many searches against it quickly.

We could use strings and grep to search for keywords, but then we would have no database and no search engine, and we could not search inside many file formats, such as compressed files, including ODT, DOCX, XLSX, etc.

So I tried to solve this problem. First of all, we need to extract what I call "spaces":

1)      Allocated space;

2)      Unallocated space;

3)      Slackspace;

Then we can run the indexer against these three spaces and we can extract all the keywords inside them.

We must remember that we have two kinds of unallocated space: the first is all the deleted files, and the second is all the files that are not in the deleted set but are still present on the storage device (hard disk, pen drive, etc.).

To extract these files we need to use the data carving technique, which consists of searching for file types by their "magic numbers" (headers and footers). This technique is filesystem-independent, so we can gather all files, allocated and unallocated (including the deleted files), which means we then need to eliminate the duplicates generated by carving.

The slack space can be extracted with the TSK (The Sleuth Kit) tools and put into one big text file; remember that slack space is all the file fragments present in the unused cluster space.

Inception

We have to create a directory named, for instance, “diskspace”.

We can mount our disk image file (bitstream, EWF, etc) into a sub-directory of diskspace, e.g. /diskspace/disk and so we can have all the allocated space.

Now, we have to extract all the deleted files including their paths and put them into “/diskspace/deleted”.

We have to run the data carving and put all the results into "/diskspace/carved"; we can carve only the free space of the disk, and then we must delete the duplicates shared with the deleted files.

Finally we can extract all the slackspace, if we need it and put it into “/diskspace/slack”.

Now we got:

/diskspace
|_disk
|_deleted
|_carved
|_slack

We only need a "spider" to index all these spaces and collect all the keywords into a database.
For this purpose there is a program in the open source world: RECOLL, which indexes the content of a directory and allows various queries. (http://www.lesbonscomptes.com/recoll/)

After the indexing we have everything we need to perform our searches.

All these operations are made by my bash script called KS.sh  http://scripts4cf.sourceforge.net/tools.html

KS – This is a keyword-searching tool; run it with sudo bash ks.sh. It mounts a DD image file, extracts all deleted files and the slack space, carves the free space only, and indexes everything with RECOLL.
You need:
The Sleuthkit (last release)
Photorec
MD5Deep
RECOLL

It stores the index DB and recoll.conf in the chosen output directory.
New file formats have been added, plus a README.txt explaining how to expand the search range.
Website:
http://scripts4cf.sourceforge.net/tools.html

This is the bash script code:

#!/bin/bash
#
# KS – by Nanni Bassetti – digitfor@gmail.com – http://www.nannibassetti.com
# release: 2.1
#
# It mounts a DD image file or a block device, it extracts all deleted files,
# it makes a data carving on the unallocated space, then it runs recoll
# changing automatically the variables in recoll.conf.
#
# many thanks to Raul Capriotti, Jean-Francois Dockes, Christophe Grenier,
# Raffaele Colaianni, Gianni Amato, John Lehr

echo -e "KS 2.1 – by Nanni Bassetti – digitfor@gmail.com – http://www.nannibassetti.com \n"
while :
do
echo -e "\nInsert the image file or the device (absolute path): "
read imm
[[ -f $imm || -b $imm ]] && break
done
while :
do
echo "Insert the output directory (absolute path):"
read outputdir
[[ "${outputdir:0:1}" = / ]] && {
[[ ! -d $outputdir ]] && mkdir $outputdir
break
}
done

(! mmls $imm 2>/dev/null 1>&2) && {
echo "0"
echo "The starting sector is '0'"
so=0
} || {
mmls $imm
echo -e "\nChoose the starting sector of the partition you need to index"
read so
}

HASHES_FILE=$outputdir/hashes.txt # file collecting MD5 hashes for duplicate removal
DIR_DELETED=$outputdir/deleted # Deleted File’s Folder
DIR_SLACK=$outputdir/slackspace # Slackspace’s Folder
DIR_FREESPACE=$outputdir/freespace # Carved File’s Folder
BASE_IMG=$(basename $imm) # Basename of the image or device

[[ ! -d $outputdir/$BASE_IMG ]] && mkdir $outputdir/$BASE_IMG

off=$(( $so * 512 ))
mount -t auto -o ro,loop,offset=$off,umask=222 $imm $outputdir/$BASE_IMG >/dev/null 2>&1 && {
echo "Image file mounted in '$outputdir/$BASE_IMG'"
}

# recovering the deleted files
echo "recovering the deleted files..."
[[ ! -d $DIR_DELETED ]] && mkdir $DIR_DELETED
tsk_recover -o $so $imm $DIR_DELETED

# extracting slack space, comment if you don’t need it
echo "extracting slack space..."
[[ ! -d $DIR_SLACK ]] && mkdir $DIR_SLACK
blkls -s -o $so $imm > $DIR_SLACK/slackspace.txt

# freespace and carving

[[ ! -d $DIR_FREESPACE ]] && mkdir $DIR_FREESPACE || {
rm -R $DIR_FREESPACE
mkdir $DIR_FREESPACE
}

# using photorec to carve inside the freespace

photorec /d $DIR_FREESPACE/ /cmd $imm fileopt,everything,enable,freespace,search

# taking off duplicates from carving directory
echo "taking off duplicates from carving directory..."
[[ $(ls $DIR_DELETED) ]] && md5deep -r $DIR_DELETED/* > $HASHES_FILE
[[ $(ls $DIR_FREESPACE) ]] && md5deep -r $DIR_FREESPACE/* >> $HASHES_FILE
awk 'x[$1]++ { FS = " " ; print $2 }' $HASHES_FILE | xargs rm -rf
[[ -f $HASHES_FILE ]] && rm $HASHES_FILE

# RECOLL configuration to have a single recoll.conf and xapiandb for each case examined.
echo "RECOLL is indexing..."
rcldir=$outputdir/recoll
recollconf=$rcldir/recoll.conf
mkdir -p $rcldir/xapiandb

cat > $recollconf << EOF
topdirs = $outputdir
dbdir = $rcldir/xapiandb
processbeaglequeue = 1
skippedPaths = $rcldir $rcldir/xapiandb
indexallfilenames = 1
usesystemfilecommand = 1
indexstemminglanguages = italian english spanish
EOF

recollindex -c $rcldir -z >/dev/null 2>&1
case $(tty) in
/dev/tty*) echo -e "\nStart the following command from a terminal in your graphical session:"
echo -e "recoll -c $rcldir\n"
exit 1
;;
*) recoll -c $rcldir >/dev/null 2>&1 &
exit 0
;;
esac


1- RECOLL in action.

RECOLL allows searching for keywords even inside compressed files and email attachments. In short, once all the content has been indexed, you are able to search for keywords or phrases just as you would with Google.
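Once indexing has finished, the index can also be queried from a terminal instead of the GUI. A minimal sketch, assuming the output directory chosen when ks.sh was run is /cases/case01 (an illustrative path) and that your RECOLL build provides the -t text-mode query option:

# Hypothetical query against the index built by ks.sh;
# /cases/case01 is an illustrative output directory.
recoll -c /cases/case01/recoll -t "confidential OR password"
# -c points recoll at the case-specific recoll.conf created by the script;
# -t runs the query in text mode and prints the matching documents.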
As with all open source projects, I have to thank some friends and developers for their collaboration.

Author
Nanni Bassetti, Digital Forensics Expert, C.A.IN.E. Linux forensic distro project manager, founder of CFI – Computer Forensics Italy, mailing list specialized in digital forensics topics, codeveloper of SFDumper and founder of the web site http://scripts4cf.sf.net.
Personal website: http://www.nannibassetti.com – e-mail: digitfor@gmail.com



Geo-tagging & Photo Tracking On iOS

As you may already know, Apple has always been criticized for using their extremely popular devices to track users and use this information to expand their own databases. This tutorial assumes that you have already jailbroken your device and you know how to navigate your way through iOS menus, if you don’t then check out our other articles that cover just that. In this small and insightful tutorial, you’ll see just how easy it is to extract photos from an Apple device and use the EXIF data to view the location of where the photo was taken along with other cool details.

Introduction & Prerequisites

Apple devices store much more information than you would ever imagine. It is surprisingly accurate as well, with timestamps to the millisecond and even location data that is frighteningly accurate. The main challenge for the user however, is correctly extracting, preserving and analyzing this information which is where awesome dudes like me come into the picture! After several months of studying the iOS architecture and how things work on an Apple device, I am more than happy to provide the community with bite size chunks of information and that is exactly what I am about to start doing with this first post, aimed at Apple forensics.

So, enough blabbering on about the facts and figures, time to get right down to business right? Well, first you gotta have the right equipment and tools, of course. Here is what you’ll need for this tutorial:

  • An Apple device – this best works with an iPhone or an iPad but could be a great success on the latest iPod Touch too.
  • The device has to be jailbroken – cause it is really easy to do and allows us to do so much more with the devices.
  • Cydia package iFile which can be downloaded from many sources on Cydia.
  • An extensive EXIF viewer, there are many available however, I prefer this one that is available online.
  • Some legs, cause the device ain’t gonna walk up a high street itself now, is it?

That is roughly everything that you’re going to be needing in order to pursue this tutorial. Let’s get to it then!

Foreword

I'll show you what we did during our research and what procedures we followed to get the end result, which is of course a picture with the location data plotted on a map that easily allows you to see your whereabouts at certain times. It should be noted that when we carried out this experiment, we took our iPad and walked down a busy high street in the heart of Glasgow, assuming that the iPad would automatically connect to open WiFi networks by itself (which it did). We never at any point connected to a network ourselves; we only had the Camera application open and were taking pictures intermittently. During the following steps, I'll break down exactly what we did, why and how.

Step 1

Take your device out for a stroll, preferably on a street that you know contains many WiFi hotspots (that is, if you have a non-cellular device such as an iPad Mini WiFi-only model). If you have an iPhone, you should be good to go anywhere because it is always connected to the Internet via radio towers.

Step 2

Take some pictures, at random times, in random places, of random things. Possibly do it with the same technique that we did – 5 pictures on the way down and 5 pictures on the way back. Notice that when you take a picture using the Camera application, the location data icon shows up on the status bar of your device, as shown below:


Location data active icon in the taskbar.

The actual icon may differ from the one I have above however, it will only pop up right after you take a picture using the Camera application. The icon will always show up when the iPad is requesting the use of location services. This can be changed within the Settings application.

Step 3

Once you have a small collection of photos that you took, you can head back in and start extracting them from the iPad. Now, you can always just sync the photos on iTunes and that’ll move them over or use some 3rd party software to transfer them but how about doing it wirelessly? That’s right. With iFile on a jailbroken device you can easily set up a web server that allows you to transfer content over to your computer.

Open up iFile and navigate to the following path:

/var/mobile/Media/DCIM/100APPLE

You’ll be presented with a screen that looks similar to the one below, of course you could have more or less photos, obviously depending on how trigger happy you are with the Camera application.


The image files contained within the folder mentioned above.

Already you can see information such as the size for each file, timestamps and file names. To initiate a web server connection, touch the wireless icon in the bottom center of the screen. This will yield this screen which shows you what to type into your address bar in a browser.


Connection has been created, use the details to access your device wirelessly.

Step 4

Now open up a browser on your laptop or desktop computer (for the love of god, do not use Internet Explorer) and type up the address that is shown on the device into the address bar. This will establish the connection between your computer and the device, enabling you to transfer files (yes, they go both ways) easily and effortlessly. Once you’ve setup the connection, you’ll be presented with this screen on your computer and a confirmation on your device.


This is what you’ll see on your computer once connection has been setup.

Step 5

You can now navigate to the path shown above on the computer and download the photos that you’ll be working with, precisely those that are located at the bottom of the folder. Just make sure that the date and time match that of when you took your initial photos. To save your photos, simply either right click on one and select Save link as… or click on it and repeat the aforementioned step. Save all your photos into one neat folder on your computer, so you can find them easily when it comes to the next step.
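If you would rather not save every photo by hand, the whole folder can usually be mirrored from iFile's web server in a single command. A hedged sketch is below; the address and port are placeholders, so replace them with whatever your device actually displays:

# Hypothetical example; 192.168.1.20:10000 stands in for the address
# shown on the device's iFile web server screen.
wget -r -np -nd -A "*.JPG,*.MOV" -P ipad_photos \
  http://192.168.1.20:10000/var/mobile/Media/DCIM/100APPLE/
# -r follows the directory listing, -A keeps only photo/video files,
# -P drops everything into a local "ipad_photos" folder.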

Step 6

This is where it begins to get interesting – with the photos extracted and ready, you can start uploading them onto the online EXIF viewer. Go ahead and open that up and upload the first image using the instructions provided on the website.


Uploading your image to the online EXIF viewer is easy, my gran could do it!

Step 7

Once your image is uploaded and the processing is complete, you'll be presented with a full page of information. Some of this information is useful, and some is not. Have a wander about and see how much you can understand, because we really need a few important details for the next bit. Notice at the top of the page there is a section that summarizes all the information that we need: a timestamp, longitude and latitude.


This EXIF viewer does a nifty job summarizing all the stuff we need.

Step 8

Go ahead and copy the latitude and longitude that is shown in brackets, you’ll need it for plotting the final coordinates later on. Now all you need to do is rinse and repeat the steps above for the remaining photos that you took, remembering to copy over the coordinates into a text file.
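If copying each pair by hand feels tedious, the whole list can be produced in one pass with the free exiftool utility. A hedged sketch, with illustrative folder and output names:

# Hypothetical batch extraction; "photos" and coords.txt are illustrative names.
exiftool -n -p '$GPSLatitude,$GPSLongitude' photos/*.JPG > coords.txt
# -p prints one formatted line per image; -n keeps the coordinates in
# decimal form, ready for the plotting site used in Step 9.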

Once you’re done, you’ll essentially have something that looks like the following image. Let’s hope you haven’t been stalking me and your coordinates are wildly different from mine.


Your final list of coordinates, probably different from mine.

Step 9

It’s time to plot this small selection of coordinates (larger list if you’re a photo fiend) on a map, provided by the good old trustworthy Google Maps. Navigate yourself to this website, which plots lists of coordinates with ease and slap in your list. Guess it’s common sense that you need to press the big green button to get anywhere, eh? You’ll start with something like this:


Pretty straightforward, eh?

And you’ll end up with the final result which is shown below:

Conclusion

So, we’ve managed to plot the coordinates of the photos taken with an Apple device – this allows us to further explore just how fascinating technology really is and how quickly it is evolving into something that may soon be beyond our control. Even though this probably won’t hold up by itself in a court of law, it could potentially be part of crucial evidence that can be used to prosecute a suspect. I hope you’ve learned something new from this tutorial and this is just the first of many steps of uncovering what else Apple has in store for us.


For more articles, visit our blog!


We would really appreciate it if you like us on Facebook.

Thanks for reading!



Android Forensics

This article covers several Android forensic techniques that can be helpful in a variety of situations. The techniques discussed below can be either logical or physical; however, we will try to stick mostly to logical techniques, meaning those that mostly involve accessing the file system and similar data rather than the raw hardware. This article also assumes that the reader has basic knowledge of Android programming and other concepts related to Android. Let's proceed to learn more.

Unlocking a screen locked Android phone / Breaking the Android passcode

Firstly, it's important to note that every technique comes with some limitation or other. You will need to figure out which technique will help you depending on the circumstances. Circumventing the passcode may not always be possible. We will take a few scenarios and see how you can take advantage in each case.

There are currently three main types of pass codes supported by Android devices – Pattern lock, PIN and alphanumeric code.

1. Smudge Attack:

This is not specific to any particular Android device; it is used generally by forensic analysts to deduce the passcode of a touch-screen mobile. The attack depends on the fact that smudges are left behind by the user's fingers due to repeated swiping across the same locations. The pattern lock or any passcode is something that the user has to swipe every time he wants to use his mobile. We can infer that the smudges will be heaviest across those same locations, and hence, under proper lighting and with high-resolution pictures, we can deduce the code. So when examining any device, forensic analysts usually take care to avoid hand contact with the screen in order to preserve the possibility of a smudge attack.

(Picture referenced from http://www.elsevierdirect.com site)

2. If USB – debugging is enabled:

If USB debugging is enabled on the Android device, then bypassing the lock code can be done in a matter of seconds. Imagine an attacker who wants to get access to his friend's files and applications on his Android mobile. You could first ask for his handset under some false pretext, to make a call for example, turn on USB debugging under Settings > Developer Options > USB debugging, and then hand the mobile back to him. Later, at some convenient time, when you get access to the device, you can exploit it using any of the ways discussed in this article. Now adb (Android Debug Bridge) is primarily a command line tool that communicates with the device. ADB is bundled with the Android platform tools. To explain in simple terms, this is what happens when you deal with adb:

  • An adb daemon runs as a background process on each Android device.
  • When you install Android SDK on your machine, a client is run. The client can be invoked from shell by giving an adb command.
  • A server is also run in the background to communicate between the client and adb daemon running on the Android device.
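To make this client/server/daemon relationship concrete, here is a minimal sketch of a typical adb session from the examiner's machine; the commands are standard adb commands, and the output of course depends on the device:

adb start-server   # the client starts the local adb server if it is not already running
adb devices        # the server queries each connected device's adb daemon and lists it
adb shell          # opens an interactive shell on the connected device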

You can use any of the below methods to take advantage of the USB debugging to bypass the screen lock:

Using UnlockAndroid.apk:

Before going ahead with this process you can download the Unlockandroid.apk file from the below location.

URL: http://www.megafileupload.com/en/file/409464/UnlockAndroid-apk.html

  1. Connect the device to the machine where Android SDK (including platform tools etc.) is installed.
  2. Open a command prompt in the platform-tools directory and type: C:\android-sdk-windows\platform-tools>adb.exe devices
  3. The device must be identified by the adb if everything is going fine.
  4. Copy the above UnlockAndroid.apk file into C:\android-sdk-windows\platform-tools directory.
  5. In the command prompt, type C:\android-sdk-windows\platform-tools>adb.exe install UnlockAndroid.apk. Observe that the application is installed on the device.
  6. To start the application just type:

    C:\android-sdk-windows\platform-tools>adb.exe shell am start -n com.rohit.unlock/com.rohit.unlock.MainActivity

  7. Observe that the screen lock is bypassed; you can now access all the applications and folders on the mobile phone. Below is a screenshot of the process.


Deleting the gesture.key file:

If the Android device is using the pattern lock and it is a rooted device, then the process below can be tried to bypass the screen lock.

  1. Connect the device to the machine where Android SDK (including platform tools etc.) is installed.
  2. Open a command prompt in the platform-tools directory and type: C:\android-sdk-windows\platform-tools>adb.exe devices
  3. The device will be identified by the adb if everything is going fine.
  4. Connect to adb shell by typing : adb.exe shell
  5. The terminal appears giving you access to shell. Now type rm /data/system/gesture.key. This is the file where pattern is stored.
  6. Restart the phone and you will still observe that the device is asking for the pattern. You can draw any random pattern and unlock the device.

    Below is the screenshot of the process.


Updating the SQLite files:

If the phone is rooted, then by updating the SQLite files you can bypass the screen lock. Here are the details.

cd /data/data/com.android.providers.settings/databases
sqlite3 settings.db
update system set value=0 where name='lock_pattern_autolock';
update system set value=0 where name='lockscreen.lockedoutpermanently';

Cracking the PIN in Android:

We have seen how to bypass the screen lock and how to completely delete or disable it. But what if we want to know the actual PIN, so that we can lock and unlock the device at any time? In Android, the PIN that the user enters is stored in the /data/system/password.key file. As you might expect, this key is not stored in plain text. It is first salted using a random string, then the SHA-1 and MD5 hash sums are calculated and concatenated, and the final result is stored. This seems very complicated, but it is no match for modern computing power, as the following code shows:

public byte[] passwordToHash(String password) {
    if (password == null) {
        return null;
    }
    String algo = null;
    byte[] hashed = null;
    try {
        byte[] saltedPassword = (password + getSalt()).getBytes();
        byte[] sha1 = MessageDigest.getInstance(algo = "SHA-1").digest(saltedPassword);
        byte[] md5 = MessageDigest.getInstance(algo = "MD5").digest(saltedPassword);
        hashed = (toHex(sha1) + toHex(md5)).getBytes();
    } catch (NoSuchAlgorithmException e) {
        Log.w(TAG, "Failed to encode string because of missing algorithm: " + algo);
    }
    return hashed;
}

Since the hash is salted, it is not possible to use a regular dictionary attack to get the original text. Here are the steps you can follow to try to crack the PIN (a command-line sketch follows the list).

  1. Pull out the salt using adb. Salt is stored in the ‘secure’ table from /data/data/com.android.providers.settings/databases/settings.db)
  2. Get the password : sha1+md5: (40+32) (stored at /data/system/password.key)

    Ex: 0C4C24508F0D29CF54FFC4DBC5520C3C10496F43313B4D3ADDFF8ACDD5C8DC3CA69CE740

  3. Once you have the MD5 hash and the salt, you can brute force the PIN using tools available on the market (e.g. hashcat).
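To make step 3 concrete, here is a hedged command-line sketch. The MD5 portion is taken from the example hash above; the salt value and file names are purely illustrative:

# Hypothetical sketch only; requires root access on the device and hashcat installed.
adb pull /data/system/password.key
# password.key holds 72 hex characters: SHA-1 (first 40) + MD5 (last 32).
# Pair the MD5 part with the salt (converted to lowercase hex) as hash:salt;
# the salt shown here is an invented example value.
echo "313B4D3ADDFF8ACDD5C8DC3CA69CE740:5f8a12bc34de56f0" > pin.hash
# hashcat mode 10 is md5($pass.$salt); the ?d?d?d?d mask brute-forces 4-digit PINs.
hashcat -m 10 -a 3 pin.hash "?d?d?d?d"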

Data Extraction in Android:

After having seen different ways to bypass the Android screen lock, now let’s have a look at how to extract the data from an Android phone. You can extract the data of all the files on the system or only those relevant files that you are interested in. But for any form of extraction, it’s important that the device is unlocked or USB-debugging is previously enabled. There are two types of extractions.

Extracting through ADB: As explained earlier, adb is a protocol that helps you to connect to Android device and perform some commands.

Boot Loader Extraction: This can be done when the device is in Boot Loader mode. This takes advantage of the fact that during boot loader mode the Android OS will not be running.

Before extracting the data, it is important to know how the data is stored in the Android device, to understand where to look and what data to pull. Android stores data mainly in the four locations below (a short adb sketch follows the list):

  1. Shared Preferences: Data is stored in key-value pairs. Shared preference files are stored in the application's 'data' directory in the 'shared_prefs' folder.
  2. Internal Storage: Stores data that is private in device’s internal memory (something like NAND flash).
  3. External Storage: Stores data that is public in device’s external memory that might not contain security mechanisms. This data is available under /sdcard directory.
  4. SQLite: This is a database that holds structural data. This data is available under /data/data/Package/database.
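A quick way to eyeball these locations on a rooted test device is sketched below; com.example.app is a placeholder package name, not a real application:

# Hypothetical sketch; root access is needed for the /data/data paths.
adb shell ls /data/data/com.example.app/shared_prefs   # shared preference XML files
adb shell ls /data/data/com.example.app/databases      # SQLite databases
adb shell ls /sdcard/                                   # public external storage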

For example, if you want to analyze the Facebook Android application, here is how you do it. Download and install the Facebook application and sign in. As soon as you install any application in Android, the corresponding application data is stored in /data/data folder. However due to the security restrictions, you cannot access or pull this data unless you have root privileges on the phone. By using adb, let us see what the /data/data folder consists of. As shown in the below fig, a quick ‘ls’ on the /data/data folder gives the below results.


Whether it's a browser, gallery or contacts, everything is an app in Android. Those are the applications that come along with the phone. Applications like games, social network apps, etc. are the applications installed by the user. But the data belonging to any of these applications is stored in the /data/data folder. So the first step is to identify where your application is.


To see the contents of that application, ‘ls’ into that directory.


As you can see, these are the folders created by the Facebook application on your phone. For instance, the cache folder may consist of images that are cached and stored for faster retrieval during normal browsing. The main area of focus would be the databases folder, where the data related to the user is stored. Here comes the concept of application security: if the application were secure enough, it would take proper steps not to store any sensitive data permanently in the databases folder. Let us see what kind of data Facebook stores when you are currently logged in. You can pull the application folder onto your system using the below command:

C:\android-sdk-windows\platform-tools>adb.exe pull /data/data/com.facebook.katana C:\test

The databases folder should now have been copied into the 'test' folder on your C drive.


In the 'databases' folder, you see DB file types, which are the SQLite files where the data is stored. To view the data present in them, you can use a browser such as SQLite Browser. Download and install SQLite Browser, then click File -> Open Database and select any of those DB files.
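If you prefer the command line to a GUI, the same files can be opened with the sqlite3 shell. A minimal sketch; "example.db" and "some_table" are placeholders for whichever DB file and table you pick:

# Hypothetical sketch using the sqlite3 command-line shell.
sqlite3 example.db ".tables"                    # list the tables in the database
sqlite3 example.db "SELECT * FROM some_table;"  # dump the rows of one table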


This shows you the database structure and helps you to browse the data stored in them. Now log out of the application, and you might notice that the data present in those tables would be deleted by the application.

To learn about additional Android forensic techniques, check out the mobile forensics course (http://www.infosecinstitute.com/courses/mobile-computer-forensics.html) offered by the InfoSec Institute. So to conclude, in this article we have seen how to bypass the Android screen lock under different conditions and how to extract the application data from Android phone.



The need for Transnational and State-Sponsored Cyber Terrorism Laws and Code of Ethics

Today, terrorists are making the best use of information technology to carry out their objectives. The NATO definition of cyber terrorism is "a cyber attack using or exploiting computer or communication networks to cause sufficient destruction to generate fear or to intimidate a society into an ideological goal" (Everard P, 2007 p 119). Cyber terrorism is achieved using most of the elements of cybercrime. Terrorist groups adopt technologies that serve two types of goals at reasonable risk. The first type includes technologies that improve the organization's ability to carry out activities relevant to its strategic objectives, such as recruiting and training. The second type includes technologies that improve the outcome of its attack operations (Bruce W et al, 2007 p 15). To date, there is no known international code of ethics or law that governs the investigation of transnational and state-sponsored cyber terrorism. The playing field currently favours terrorist organizations, and world security forces have been holding international conferences and workshops to develop best practices and policies that would enable them to be match fit for any future terrorist strikes. Sadly, little has been achieved. This paper critically examines different international guidelines for investigating transnational and state-sponsored cyber terrorism and the need for an International Code of Ethics and Law to combat these attacks.

Oorn and Tikk named the Cyber Crime Convention (ETS No. 185), together with the Convention on the Prevention of Terrorism (CETS No. 196), as "the most important international instrument for fighting cyber terrorism and other terrorist use of the Internet" (Oorn R and Tikk E 2007, p 91). Unfortunately, many states are not in agreement with Oorn's and Tikk's assertions. Furthermore, at the 11th meeting of the Committee of Experts on Terrorism (CODEXTER) of the Council of Europe, with regard to possible legal responses to terrorism and cyber terrorism, Professor Sieber outlined in particular the harmonisation of national substantive criminal law and of national procedural law; the improvement of international co-operation; and other important aspects, such as, inter alia, the duty to protect infrastructures, data security certifications and preventive monitoring of data. Having examined the previous conventions, none of the above adequately covers serious threats to commit terrorist acts, nor is this shortcoming remedied by the capabilities of other international organizations.

In 2001, the European Commission prepared a document entitled "Network and Information Security: Proposal for a European Policy Approach" (Ozeren S, 2008). The first two conditions dealt with legislative provisions and the training of law enforcement officers to deal with both domestic and transnational criminal activities.

Furthermore, the second condition was augmented at the Tenth United Nations Congress on Crime Prevention and Treatment of Offenders, held in Vienna, Austria in 2000. It was stressed that the exchange of technical and forensic expertise between national law enforcement authorities was imperative for faster and more effective investigation of such crimes (Tenth United Nations Congress, 2000). From 27 April to 18 May 2007, Estonia came under heavy Distributed Denial of Service (DDoS) attacks. One person residing in Estonia was charged, and legal cooperation was requested from Russia; however, this request remained unanswered as Russia had not criminalized computer crimes (Oorn R and Tikk E, 2008). As such, the aforementioned recommendation from the Tenth United Nations Congress can address issues such as best practices for obtaining and preserving evidence and maintaining the chain of custody of that evidence across borders, and may clear up differences in language. This highlights the urgent need for an International Law and Code of Ethics in combating cyber terrorism.

Interpol (the International Criminal Police Organization), through the collaboration of experts from members of national computer crime units, has developed the Computer Crime Manual, now called the Information Technology Crime Investigation Manual (ITCIM). It is considered a best practice guide for the experienced investigator, and includes numerous training courses in order to share expertise with other members, a rapid information exchange system which essentially consists of two elements, and training video/CD-ROM material for international law enforcement (Interpol, 2003). These best practices were hammered out through Interpol's collaboration with five major working parties, namely: a) the European Working Party on Technology Crime, b) the American Regional Working Party on Information Technology Crime, c) the African Regional Working Party on Information Technology Crime, d) the Asia Steering Committee for Information Technology Crime and e) the Steering Committee for Information Technology Crime (Interpol, 2003). These working parties provide representation from the majority of the continents, and much use can be made of their experiences and strengths. As such, the ITCIM can be used as the foundation on which an International Code of Ethics for Cyber Crime and Cyber Terrorism can be built.

The majority of the Conventions have emphasized collaboration and cooperation from other states in combating cyber terrorism. Different world security forces have all put forth policies and key conditions for international law enforcement to exist. However, there is little mentioned on the barriers to achieving international law and the detection and prosecution of transnational and state-sponsored cyber terrorism. Some states are not cooperating.

Professor John Walker, at an international crime conference in London, has accused China of state-sponsored terrorism and has said that the Chinese government was responsible for the 'Titan Rain' attacks on the United States and United Kingdom. He further lamented, "No matter how much collaboration you have internationally, if you have a state-sponsored terrorist coming out of China or Russia you are not going to get them. If they are state-sponsored e-criminals they are doing it for a purpose. And you cannot extradite them" (Blincoe R, 2008). But if Russia and China are doing it, so must the West, either to develop counter strategies or to develop their own proactive capabilities. It has been said that the next world war will be fought with bits and bytes, not bullets and bombs (Gaur K 2006). Until such time as both or all nations agree to stop, there will be no stopping this escalation in international state-sponsored terrorism. To make matters worse, the leading internet company Google has censored its search services in China in order to gain a greater market share and to satisfy the Beijing authorities. This resulted in censored information being disseminated to users. These acts, together with Professor Walker's assertions, justify the need for an International Law and Code of Ethics to address state-sponsored and transnational cyber terrorism. To avoid territorial differences in terms of protection, and practical problems in the detection and prosecution of cyber terrorist activities, any draft International Law and Code of Ethics must be ratified by all states. However, due to the existence of non-cooperative and communist states, these requirements are a long way off.

In conclusion, it appears that only lip service has been paid to combating cyber terrorism. In the absence of an International Code of Ethics and Law to combat cyber terrorism, security forces must become better informed of terrorists' use of the internet and better able to monitor their activities. They must also explore measures to limit the usability of this medium by modern terrorists. Finally, security forces must adhere to individual or selected best practices and guidelines to ensure that the integrity of any evidence seized in the few cases detected is beyond reproach and that successful prosecution may eventually ensue.

Bibliography

Blincoe R (2008) China Blamed for Cyber-Terrorism Retrieved November 14 2009 from http://www.itservicesconnected.co.uk/News/July-2008/China-Blamed-for CyberTerrorism.aspx

Colarik A M (2006) Cyber Terrorism – Political and Economic Implications. IGI Publishing

Don B W, Frelinger D R, Gerwehr S, Landree E,  Jackson B A (2007) Network Technologies for Networked Terrorists Assessing the Value of Information and Communication Technologies to Modern Terrorist Organizations, Prepared for the Department of Homeland Security. RAND Retrieved on October 11 2009 from: http://www.rand.org/pubs/technical_reports/2007/RAND_TR454.pdf

Gaur,K. (2006). Cyber warfare – malicious code warfare. Retrieved from http://my.opera.com/kalkigaur/blog/show.dml/450195

Google censors itself for China (2006) Retrieved November 14 2009 from http://news.bbc.co.uk/1/hi/technology/4645596.stm

Ozeren S (2008) Cyberterrorism and International Cooperation: General Overview of the Available Mechanisms to Facilitate an Overwhelming Task  (pp 78-82)

Sieber U, Brunst P (2009)  Cyberterrorism and Harmonization of Criminal Prosecution

Retrieved November 3 2009 from:

http://www.mpicc.de/ww/en/pub/forschung/forschungsarbeit/strafrecht/cyberterrorismus.htm

Tikk E, Oorn R (2008) Legal and Policy Evaluation: International Coordination of Prosecution and Prevention of Cyber Terrorism. Responses to Cyber Terrorism. Amsterdam, IOS Press (pp 64-67)

Weimann G (2008) WWW.AL-Qaeda: The Reliance of al-Qaeda on the Internet. Responses to Cyber Terrorism. Amsterdam, IOS Press (pp 64-67)



Catching the ghost: how to discover ephemeral evidence with Live RAM analysis

Oleg Afonin and Yuri Gubanov, contact@belkasoft.com
© Belkasoft Research, 2013


Belkador Dali. “Losing volatile Evidence”.
All rights reserved.
 

Ephemeral Evidence


Until very recently, it was a standard practice for European law enforcement agencies to approach running computers with a “pull-the-plug” attitude without recognizing the amount of evidence lost with the content of the computer’s volatile memory. While certain information never ends up on the hard drive, such as ongoing communications in social networks, data on running processes or open network connections, some other information may be stored securely on an encrypted volume. By simply pulling the plug, forensic specialists will slam the door to the very possibility of recovering these and many other types of evidence.

The Role of Live RAM Analysis in Today’s Digital Forensics

Capturing and analyzing volatile data is essential for discovering important evidence. Making a RAM dump should become a standard operating procedure when acquiring digital evidence before pulling the plug and taking the hard drive out.

Types of Evidence Available in Volatile Memory


Many types of evidence are available in the computer's volatile memory and can be extracted by analyzing memory dumps. Volatile and ephemeral evidence types include:

  • Running processes and services;
  • Unpacked/decrypted versions of protected programs;
  • System information (e.g. time lapsed since last reboot);
  • Information about logged in users;
  • Registry information;
  • Open network connections and ARP cache;
  • Remnants of chats, communications in social networks and MMORPG games;
  • Recent Web browsing activities including IE InPrivate mode and similar privacy-oriented modes in other Web browsers;
  • Recent communications via Webmail systems;
  • Information from cloud services;
  • Decryption keys for encrypted volumes mounted at the time of the capture;
  • Recently viewed images;
  • Running malware/Trojans.

Limitations of Volatile Memory Analysis

Realistically, Live RAM analysis has its limitations, lots of them. Many types of artifacts stored in the computer's volatile memory are ephemeral: they're here one minute, gone the next. While information about running processes will not go anywhere until they are finished, remnants of recent chats, communications and other user activities may be overwritten with other content at any moment the operating system demands yet another memory block.

Investigators should expect to extract remnants of recent user activities, parts and bits of chats and conversations, etc. Essentially, only recent information will still be available in the content of volatile memory.

Collecting Volatile Data that Can Withstand Legal Scrutiny

Legal, organizational and technical aspects of data acquisition are all equally important when acquiring ephemeral evidence.


The choice of tools and methods of capturing volatile data is extremely important. The choice of a wrong tool or an improper use of the right one may render the entire acquisition useless (more on that later). An attempt to use an inappropriate tool may not only fail to produce meaningful results, but to irreversibly destroy existing evidence.

It is essential to realize that acquiring volatile memory will inevitably leave acquisition footprint. While this may seem acceptable to the law enforcement officer performing the acquisition, convincing the court will be a different matter. Proper documentation of every step of the acquisition process is essential for collecting evidence that can withstand legal scrutiny.

Acquisition Footprint

In order to acquire the content of the computer’s volatile memory, the investigator will have to execute a memory dumping tool, thus inevitably leaving an acquisition footprint both in the volatile memory and on the computer’s hard disk. Therefore, it is essential to carefully weigh the benefits of RAM acquisition against such drawbacks, taking into account that dumping live RAM contents might be the only way to obtain certain types of evidence (including, for example, decryption keys used to access to encrypted disk volumes that may contain orders of magnitude more evidence than RAM alone).

Currently, most court systems are ready to recognize the fact that certain footprint is introduced by law enforcement during the acquisition process. For that to be the case, the entire acquisition process must be carefully documented.

Live Box vs. Offline Analysis

Performing analysis of a running computer requires a careful assessment of risk vs. potential benefits. The first step of live box analysis should always involve capturing a memory dump for off-line analysis. Should anything go wrong during the investigation of a running computer, the memory dump can still be analyzed. After taking a memory dump, continuing with live box analysis may be beneficial if, for example, there is certain information stored on remote servers, and a network connection (e.g. a secure VPN connection or an RDP session) is established which may be lost when the computer is plugged off.

Standard Procedure

The official ACPO Guidelines recommend the following standard procedure for capturing a memory dump:

  • Perform a risk assessment of the situation: Is it evidentially required and safe to perform volatile data capture?
  • If so, install volatile data capture device (e.g. USB Flash Drive, USB hard drive etc.)
  • Run the volatile data collection script.
  • Once complete, stop the device (particularly important for USB devices which if removed before proper shutdown can lose information).
  • Remove the device.
  • Verify the data output on a separate forensic investigation machine (not the suspect system).
  • Immediately follow with standard power-off procedure.

Tools and Techniques for Capturing Memory Dumps

A range of tools and methods are available to capture memory dumps. From the forensic perspective, there are certain requirements that any such tool must strictly conform to. In no particular order, the list of essential requirements goes like this.

  1. Kernel-mode operation;
  2. Smallest footprint possible;
  3. Portability;
  4. Read-only access.

Kernel-mode operation is essential for a forensic memory capturing tool. With many applications proactively protecting their memory sets against dumping, running a memory acquisition tool that operates in user mode is simply suicidal. At best, such tools will read zeroes or random data instead of the actual information. In a worst-case scenario, a proactive anti-debugging protection will take immediate measures to effectively destroy protected information, then lock up and/or reboot the computer, making any further analysis impossible.

For this not to happen, investigators must use a proper memory acquisition tool running in the system’s most privileged kernel mode. Notably, current versions (as of April 24, 2014) of two popular forensic memory dumping tools, AccessData FTK Imager and PMDump, run as user-mode applications and are unable to overcome protection imposed by anti-debugging systems operating in a privileged kernel mode.


The smaller the footprint left by a memory acquisition tool, the better. Using such a tool already leaves traces and potentially destroys certain evidence; the less of this, the better.

Memory dumping tools must be portable, ready to run from an investigator-provided device (e.g. USB flash drive or a network location). Tools requiring installation are inadmissible for obvious reasons.

Finally, a sound forensic tool should never write anything onto the disk of the computer being analyzed, and should not create or modify Registry values, etc.

Belkasoft makes a tool that complies with all the requirements: Belkasoft RAM Capturer. The tool comes with 32-bit and 64-bit Windows drivers, allowing it to dump proactively protected memory content in kernel mode.

Consequences of Choosing the Wrong Tool

Many types of computer games, chat rooms, encryption programs and malware are known to be using some sort of anti-dumping protection. In mild scenarios (e.g. commercial products and games), an attempt to read a protected memory area will simply return empty or garbage data instead of the actual information.

In worst-case scenarios, an anti-debugging system detecting an attempt to read protected memory areas may take measures to destroy affected information and/or cause a kernel mode failure, locking up the computer and making further analysis impossible. This is what typically happens if a user-mode volatile memory analysis tool is used to dump content protected with a kernel-mode anti-debugging system.

The FireWire Attack

One technique in particular allows capturing the computer’s RAM without running anything foreign on the system. This technique works even if a computer is locked, or if no user is logged on. The FireWire attack method [1] is based on a known security issue that impacts FireWire / i.LINK / IEEE 1394 links. One can directly acquire the computer’s operating memory (RAM) by connecting through a FireWire link.

What makes it possible is a feature of the original FireWire/IEEE 1394 specification allowing unrestricted access to a PC's physical memory for external FireWire devices via Direct Memory Access (DMA). As this is DMA, the exploit works regardless of whether the target PC is locked or even logged on. There is no way to protect a PC against this except explicitly disabling FireWire drivers. The vulnerability exists for as long as the system is running. Multiple tools are available to carry out this attack.

Note that the use of this technique has certain requirements. The technique either requires that the computer has a FireWire port and working FireWire drivers are installed (and not disabled) in the system, or makes use of a hot-pluggable device adding FireWire connectivity to computers without one. For example, a PCMCIA/Cardbus/ExpressCard slot in a laptop can be used to insert one of the popular Firewire add-on cards. There is a high probability that the operating system will automatically load the driver for that card, allowing the attacker to use the card for performing a FireWire attack. [3]

Sources such as [3] even describe techniques that allow using an iPhone as a FireWire capturing device!

The “Freezer Attack” on Scrambled Smartphones

An ordinary household freezer has been successfully used to attack an encrypted smartphone's memory content after the phone has been turned off [2].

After the release of Android 4.0, smartphones running the new OS gained the ability to encrypt (scramble) data stored on user partitions. This security feature protects user’s information against attacks bypassing screen locks.

Disk decryption keys are stored in the phone's volatile memory and can be retrieved by performing a cold boot, as demonstrated by German researchers. The idea is to cool the smartphone down to a low temperature (about -15 degrees Celsius) in order to slow down the fading of RAM contents. Cooled-down phones are then reset into "fastboot" mode and connected to a PC running the custom-developed FROST "fastboot" software. The software allows searching for volume decryption keys, performing a RAM memory dump, and cracking screen lock keys (4-digit PINs only).

FROST software and step-by-step instructions can be downloaded from https://www1.informatik.uni-erlangen.de/frost

Tools for Analyzing Memory Dumps

At this time, no single forensic tool can extract all possible artifacts from a memory dump. Different tools are used to analyze chat remnants, lists of running processes or extract decryption keys for encrypted volumes mounted at the time of the capture. A brief list of such analysis tools is available below.

Belkasoft Evidence Center [ http://Belkasoft.com/ ] : remnants of conversations and communications occurring in social networks, chat rooms, multi-player online games, Skype; data from cloud services such as Flickr, Dropbox, Sky Drive, Google Drive etc.; communications in Webmail systems such as Gmail, Hotmail, Yahoo; Web browser and virtual worlds artifacts, and so on.


Elcomsoft Forensic Disk Decryptor [ http://elcomsoft.com/ ]: extracts decryption keys protecting encrypted volumes (PGP, True Crypt, BitLocker and Bitlocker To Go containers are supported), allowing investigators to instantly access the content of these encrypted volumes without brute-forcing the original volume password. All the keys from a memory dump are extracted at once, so if there is more than one crypto container in the system, there is no need to re-process the memory dump.

Passware [ http://passware.com ]: forensic toolkit including tools for capturing memory dumps via FireWire attack. Also includes a tool to extract decryption keys for popular crypto containers.
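Beyond the tools listed above, the open-source Volatility framework is also widely used for pulling process, network and registry artifacts out of memory dumps. A minimal, hedged sketch, assuming a Windows 7 x64 dump saved as memdump.raw (an illustrative file name):

# Hypothetical example using the open-source Volatility 2 framework;
# the dump file name and profile are assumptions.
python vol.py -f memdump.raw --profile=Win7SP1x64 pslist    # running processes
python vol.py -f memdump.raw --profile=Win7SP1x64 netscan   # open network connections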

About the Authors


Yuri Gubanov is a renowned computer forensics expert. He is a frequent speaker at industry-known conferences such as EuroForensics, CEIC, China Forensic Conference, FT-Day, ICDDF, TechnoForensics and others. Yuri is the Founder and CEO of Belkasoft. He is also the author of f-interviews.com, a blog where he interviews key persons in the digital forensics and security domain. You can add Yuri Gubanov to your LinkedIn network at http://linkedin.com/in/yurigubanov


Oleg Afonin is Belkasoft sales and marketing director. He is an expert and consultant in computer forensics.

Contacting the Authors

You can contact the authors via email: contact@belkasoft.com

About Belkasoft Research

Belkasoft Research is based at St. Petersburg State University. The company performs non-commercial research and scientific activities.

About Belkasoft


Founded in 2002, Belkasoft is an independent software vendor specializing in computer forensics and IT security software. Belkasoft products back the company’s “Forensics made easier” slogan, offering IT security experts and forensic investigators solutions that work right out of the box, without requiring a steep learning curve or any specific skills to operate.

Belkasoft Evidence Center 2013 is a world renowned tool used by thousands of customers for conducting forensic investigations, as well as for law enforcement, intelligence and corporate security applications. Belkasoft customers include government and private organizations in more than 40 countries, including the FBI, US Army, DHS, police departments in Germany, Norway, Australia and New Zealand, PricewaterhouseCoopers, and Ernst & Young.

References

[1] The FireWire attack method existed for many years, but for some reason it’s not widely known. This method is described in detail in many sources such as http://www.securityresearch.at/publications/windows7_firewire_physical_attacks.pdf or http://www.hermann-uwe.de/blog/physical-memory-attacks-via-firewire-dma-part-1-overview-and-mitigation

[2] FROST: Forensic Recovery Of Scrambled Telephones
https://www1.informatik.uni-erlangen.de/frost

[3] Physical memory attacks via Firewire/DMA – Part 1: Overview and Mitigation (Update) | Uwe Hermann
http://www.hermann-uwe.de/blog/physical-memory-attacks-via-firewire-dma-part-1-overview-and-mitigation




Extracting data from damaged mobile devices

For the last few years we have successfully extracted data from various mobile devices, such as cell phones, smartphones and tablets. Among the devices submitted for examination, we have come across defective ones (damaged mechanically, by fire, or by storage in harsh or hostile environmental conditions) from which digital evidence still had to be extracted. We have developed several approaches to examining damaged mobile devices which we would like to share with our colleagues.

 


Fig. 1. A phone with a broken display.


Fig. 2. Nokia C1 that has been exposed to high temperatures.


Fig. 3. A phone that has been stored in harsh environmental conditions. The red indicator shows that the phone has water inside or has been stored in high humidity conditions.


Fig. 4. A phone with mechanical damage (© Aleksey Yakovlev).

 

Before examining a damaged mobile device, a forensic investigator must determine exactly what is damaged. It is rarely necessary to desolder the memory chip right away and work on it directly; experience shows that there are usually simpler ways to extract data from damaged mobile devices.

 

Let’s take a look at them.

 

The most common defect in mobile devices received for forensic examination is a broken display: the device is operational but, because of the broken display, cannot show any data. Examining such devices presents no problems. To examine mobile devices with a broken display, we use UFED (Cellebrite Mobile Synchronization LTD) and .XRY (Micro Systemation): we create a physical memory dump of the device and extract data (the phone book, calls, SMS messages, graphic files, videos, etc.) from it. When the available equipment does not support creating a physical memory dump for a given device, we perform a logical extraction instead; many forensic programs for mobile device analysis, such as Oxygen Forensic Suite (Oxygen Software Company), can be used for this. Moreover, you can always replace a damaged display with a new one. This makes the examination more expensive and time-consuming, but it is often the only possible solution (for example, when examining an Android device with the USB Debugging option disabled).

 

In some cases, to extract data, we use specialized flasher tools (RIFF Box, Medusa Box, etc.) designed for repairing mobile devices. Such flasher tools work over the JTAG interface. Using them, you can extract data from mobile devices with damaged system software or with information protected by a PIN.

 

Chip swapping. This method consists in extracting the memory chip from a damaged mobile device and installing it into an identical working device. In doing so, you avoid several complex problems that would have to be faced with a “Chip-Off” technique: there is no need to know the type of controller the device uses to manage memory chip data, the format of memory pages on the chip, the type and features of the file system used by the device, or the format in which data is stored (as soon as you have to manually decode a physical memory dump, you will see what we mean!). The drawbacks of the method include the need for a device (preferably two devices) identical to the one received for examination. Desoldering a chip is a complex and laborious task, and there is a risk of destroying data through heat or mechanical damage to the chip. You may also need equipment for reballing, for example the JOVY SYSTEMS JV-RKC kit for reballing BGA chips.

 

When using this method, you cannot rule out the possibility that all data on the memory chip will be erased once the chip is swapped into the donor device. This often happens when the memory controller is installed on the system board as a separate chip. Structurally it usually looks like a sandwich: on one side of the system board there is the memory chip, on the other the memory controller chip.

Therefore, if you have two identical devices which you can use as “donors”, try swapping their memory chips and observe the devices' behaviour before examining the evidence device.

 

In cases where memory chip swapping results in data loss, you should transfer both the memory chip and the memory controller chip from the damaged device into the donor device.

 

When examining a damaged device, you should pay attention to the construction of its system board. We once examined a Motorola V3 phone which had spent two years in the ground. The phone looked awful: various oxides had damaged its housing and system board, and it was out of order. However, after the phone had been disassembled, it turned out that the system board consisted of several parts, and the part carrying the memory chip had suffered the least from environmental exposure. To extract the data, we bought an identical phone at an online auction, swapped the memory-chip part of its system board for the part taken from the damaged phone, and read the data.

 

If none of the methods described above helps, you will have to use the Chip-Off technique.

 

An investigator who wants to extract data from a mobile device memory chip must follow four main steps:

 

1)     Chip extraction.

2)     Extracting data from the memory chip.

3)     Flash translation layer (FTL) reconstruction.

4)     Dump decoding.

 

Let’s take a closer look at these steps:

 

Step 1. Chip extraction.

 

Chip extraction is a rather simple task: it is sufficient to heat the chip with a hot air stream from a soldering station and separate the chip from the system board. At this step, it is very important not to overheat the chip (which would erase the data) or damage it mechanically. Gradually raise the temperature of the hot air.

 

Step 2. Extracting data from the memory chip.

 

Our colleagues sometimes ask us, “Which flasher tool should be used to extract data from the memory chip of a <mobile device model>?” The question is ill-posed. Mobile phone manufacturers can change the chipset of a device even within a single production batch; given two mobile devices from the same batch, we cannot say with confidence that they use identical memory chips. Therefore, without knowing which particular chip is used in the device to be examined, you cannot answer the question about the flasher tool, even if you know the phone model. Another piece of bad news is that a mobile device can have several memory chips, and you must find all of them.

 

This step is not difficult provided you have a flasher tool with an adapter for the necessary BGA chip form factor. Finding such a flasher tool, however, is a real problem. We have had many discussions with colleagues about which flasher tool to buy for Chip-Off work: a good flasher tool with a large number of adapters for various BGA form factors can cost a fortune, and it is uneconomical to spend that much on a device you will rarely use. As a result, we have reached a consensus that, when necessary, we will rent such equipment from large service centres that specialize in electronics repair.

 

We would also like to draw colleagues' attention to EPOS FlashExtractor, from the Ukrainian company EPOS, and PC-3000 Flash, from the Russian company ACE Lab. These equipment kits contain adapters for connecting memory chips of various form factors, but you will have to solder the chips into the adapters yourself, which is a complex and laborious task.

 

Step 3. Flash translation layer (FTL) reconstruction.

 

FTL reconstruction consists in excluding service areas from memory pages and joining the pages in the correct order. The products mentioned above, EPOS FlashExtractor (EPOS) and PC-3000 Flash (ACE Lab), help a lot at this step: they have large knowledge bases describing data storage structures in various types of memory chips and the various controllers used to manage the data stored on them. Using these products, you can also perform FTL reconstruction manually.
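
As a rough illustration of the first half of this step, the sketch below strips the spare (service) areas from a raw NAND dump. The page and spare sizes are assumptions for a typical chip; real FTL reconstruction must also reorder pages using controller-specific metadata kept in the spare area, which is not shown here.

def strip_spare_areas(dump: bytes, page_size: int = 2048, spare_size: int = 64) -> bytes:
    """Drop the out-of-band (spare/service) area that follows every NAND page.

    Toy sketch only: real FTL reconstruction must also reorder pages using the
    logical block numbers stored in the spare area, which is controller-specific.
    """
    stride = page_size + spare_size
    return b"".join(dump[i:i + page_size] for i in range(0, len(dump), stride))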

 

We use the following test to assess the dump obtained at this stage. Any mobile device contains graphics files, either created by users or used by programs. We consider FTL reconstruction to have failed if we cannot recover graphics files (or image fragments) larger than 2 KB from the dump.
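
The check itself can be approximated with a simple signature scan. The sketch below looks for complete JPEG streams (SOI to EOI markers) of at least 2 KB, the threshold described above; it is a sanity check, not a full carver.

def carve_jpeg_candidates(dump: bytes, min_size: int = 2048):
    """Return (start, end) offsets of byte runs that look like complete JPEGs."""
    results = []
    start = dump.find(b"\xff\xd8\xff")            # SOI marker
    while start != -1:
        end = dump.find(b"\xff\xd9", start + 3)   # EOI marker
        if end != -1 and (end + 2 - start) >= min_size:
            results.append((start, end + 2))
        start = dump.find(b"\xff\xd8\xff", start + 3)
    return results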

 

Step 4. Dump decoding.

 

Dump decoding is a complex task. The basics are taught at training courses (for example, those provided by Cellebrite Mobile Synchronization LTD). However, do not expect to handle a physical dump of a real phone as easily as a training dump. If XRY (Micro Systemation) or UFED Physical Analyzer (Cellebrite Mobile Synchronization LTD) supports decoding a physical dump for the device you are examining, you can try to decode the extracted dump with these programs. UFED Physical Analyzer is the easier of the two, as it allows you to customize the action sequence when processing a physical dump and to write custom Python modules for dump analysis. In addition, the investigator's work at this step is made much easier by programs such as RevEnge (Sanderson Forensics), Phone Image Carver (GetData Pty Ltd) and Cell Phone Analyzer (BKForensics).

 

This concludes our summary of the methods and tools used to extract data from damaged phones. We hope you have found it useful.

 

 

About the Author: Igor Mikhaylov

Independent law enforcement professional from the Russian Federation

Interests: Computer, Cell Phone & Chip-Off Forensics
Contacting the Author: http://linkedin.com/in/igormikhaylovcf
Site: http://computer-forensics-lab.org

 



Detecting Forged (Altered) Images

Are digital images submitted as court evidence genuine or have the pictures been altered or modified? We developed a range of algorithms performing automated authenticity analysis of JPEG images, and implemented them into a commercially available forensic tool. The tool produces a concise estimate of the image’s authenticity, and clearly displays the probability of the image being forged. This paper discusses methods, tools and approaches used to detect the various signs of manipulation with digital images.


How many kittens are sitting on the street? If you thought “four”, read along to find out!

Alexey Kuznetsov, Yakov Severyukhin, Oleg Afonin, Yuri Gubanov

© Belkasoft Research 2013

Introduction

Today, almost everyone has a digital camera, and literally billions of digital images have been taken. Some of these images are used for purposes other than family photo albums or Web site decoration.

With the rise of digital photography, manufacturers of graphic editing tools quickly gained momentum. The tools are becoming cheaper and easier to use – so easy, in fact, that anyone can use them to enhance their images. Editing or post-processing, if done properly, can greatly enhance the appearance of a picture, increase its impact on the viewer and better convey the artist's message. But at what point does a documentary photograph become a fictional work of art?

While for most purposes editing pictures is perfectly fine, certain types of photographs are never to be manipulated. Digital pictures are routinely handed to news editors as part of event coverage. Digital pictures are presented to courts as evidence. For news coverage, certain types of alterations (such as cropping, straightening verticals, adjusting colors and gamma, etc.) may or may not be acceptable. Images presented as court evidence must not be manipulated in any way; otherwise they lose credibility as acceptable evidence.

Today’s powerful graphical editors and sophisticated image manipulation techniques make it extremely easy to modify original images in such a way that any alterations are impossible to catch by an untrained eye, and can even escape the scrutiny of experienced editors of reputable news media. Even the eye of a highly competent forensic expert can miss certain signs of a fake, potentially allowing forged (altered) images to be accepted as court evidence.

Major camera manufacturers have attempted to address the issue by introducing systems based on secure digital certificates. The purpose of these systems was to prove that images were not altered after being captured by the camera. Although aimed primarily at photojournalists and editors, the system was also used in legal cases to certify court evidence. The approach looks terrific on paper; the only problem is that it does not work. A Russian company was able to easily forge images signed by Canon and then Nikon digital cameras, and the obviously faked images successfully passed the authenticity test of the respective manufacturers' verification software.

Which brings us to the question: if human experts have a hard time determining whether a particular image was altered, and if existing certificate-based authenticity verification systems cannot be relied upon, should we just give up on the issue altogether?

This paper demonstrates a new probabilistic approach allowing automatic authenticity analysis of a digital image. The solution uses multiple algorithms analyzing different aspects of the digital image, and employs a neural network to produce an estimate of the image's authenticity, expressed as the probability of the image being forged.

1. What Is a Forged Image?

What constitutes a manipulated image? For the purpose of this paper, we consider an image altered if any modification, alteration or “enhancement” was made to it with any software, including RAW conversion tools, after the image left the camera.

That said, we don’t consider an image to be altered if only in-camera, internal conversions, filters and corrections such as certain aberration corrections, saturation boost, shadow and highlight enhancements and sharpening are applied. After all, the processing of raw pixel data captured from the digital sensor is exactly what the camera’s processor is supposed to be doing.


How many umbrellas? Read along to find out!

But is every altered image a forged one? What if the only things done to the image were standard and widely accepted techniques such as cropping, rotating or applying horizon correction? These and some other techniques do alter the image, but do not necessarily forge it, and this point may be brought before an editor or a judge, making them accept an altered image as genuine [1]. Therefore, the whole point of forgery analysis is determining whether any changes were made that alter the meaningful content of the image. We therefore analyze an image at pixel level in order to detect whether significant changes were made to the actual pixels, altering the content of the image rather than its appearance on the screen.

Considering all of the above, it is pretty obvious that no single algorithm can reliably detect content alterations. In our solution, we use multiple algorithms which fall into one of two major groups: pixel-level content analysis algorithms that locate modified areas within the image, and algorithms that analyze image format specifics to determine whether certain corrections were applied to the image after it left the camera.

In addition, certain methods we had high hopes for turned out to be not applicable (e.g. block artifact grid detection). We’ll discuss those methods and the reasons why they cannot be used.

2. Forgery Detection Algorithms

Providing a comprehensive description of each and every algorithm used for detecting forged images would not be feasible, and would be out of scope of this paper. We will describe five major techniques used in our solution to feed the decisive neural network (the description of which is also out of scope of this paper).

The algorithms made it into a working prototype, and then to commercial implementation. At this time, forgery detection techniques are used in the Forgery Detection plugin [http://forensic.belkasoft.com/en/forgery-detection], an extension of a forensic tool Belkasoft Evidence Center. The plugin can analyze images discovered with Belkasoft Evidence Center, and provide the probability of the image being manipulated (forged).

2.1. JPEG Format Analysis

JPEG is a de-facto standard in digital photography. Most digital cameras can produce JPEGs, and many can only produce files in JPEG format.

The JPEG format is an endless source of data that can be used for the purposes of detecting forged images. The JPEG Format Analysis algorithm makes use of information stored in the many technical meta-tags available at the beginning of each JPEG file. These tags contain information about quantization matrices, Huffman code tables, chroma subsampling and many other parameters, as well as a miniature version (thumbnail) of the full image. The content and sequence of those tags, as well as which particular tags are present, depend on the image itself as well as the device that captured it or the software that modified it.
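
As an illustration of how this structural information can be enumerated, here is a short Python sketch that walks the JPEG marker segments. It lists only raw marker codes and segment lengths; decoding tag contents is left to EXIF libraries.

import struct

def list_jpeg_segments(path: str):
    """Walk the JPEG marker structure and return (marker, length) pairs.

    The presence, order and sizes of segments such as APP0/APP1 (EXIF),
    DQT (quantization tables) and DHT (Huffman tables) form part of the
    "fingerprint" left by the camera or by editing software.
    """
    segments = []
    with open(path, "rb") as f:
        if f.read(2) != b"\xff\xd8":              # SOI
            raise ValueError("not a JPEG file")
        while True:
            marker = f.read(2)
            if len(marker) < 2 or marker[0] != 0xFF:
                break
            if marker[1] == 0xDA:                 # SOS: entropy-coded data follows
                segments.append(("FFDA", None))
                break
            (length,) = struct.unpack(">H", f.read(2))
            segments.append((f"FF{marker[1]:02X}", length))
            f.seek(length - 2, 1)                 # skip the segment payload
    return segments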

In addition to technical information, JPEG tags contain important information about the photo including shooting conditions and parameters such as ambient light levels, aperture and shutter speed information, make and model of the camera and lens the image was taken with, lens focal length, whether or not flash was being used, color profile information, and so on and so forth.

The basic analysis method first verifies the validity of the EXIF tags in an attempt to find discrepancies. This may include, for example, checks for EXIF tags added in post-processing by certain editing tools, or checks of the capture date against the date of last modification. However, EXIF tags can be forged so easily that, while an existing EXIF discrepancy can be treated as a positive sign of an image being altered, tags that are “in order” carry no meaningful information.

Our solution makes an attempt to discover discrepancies between the actual image and available EXIF information, comparing the actual EXIF tags against tags that are typically used by a certain device (one that’s specified as a capturing device in the corresponding EXIF tag). We collected a comprehensive database of EXIF tags produced by a wide range of digital cameras including many smartphone models. We’re also actively adding information about new models as soon as they become available.

In addition to EXIF analysis, we review the quantization tables in all image channels. Most digital cameras use a limited set of quantization tables; therefore, we can discover discrepancies by comparing hashes of the actual image's quantization tables against those expected from the specified camera.


The EXIF tags of this image are a clear indication of image manipulation: the “Software” tag names the software used for editing the image, and the original date and time do not match the last modification date and time.
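
A minimal sketch of these two tag-level checks using Pillow is shown below; the editor names in the list are illustrative assumptions rather than the plugin's actual rules.

from PIL import Image
from PIL.ExifTags import TAGS

def basic_exif_checks(path: str):
    """Flag the two simple discrepancies discussed above."""
    raw = Image.open(path)._getexif() or {}
    exif = {TAGS.get(tag_id, tag_id): value for tag_id, value in raw.items()}
    findings = []
    software = str(exif.get("Software", ""))
    # Illustrative list of editor names; not exhaustive
    if any(editor in software for editor in ("Photoshop", "GIMP", "Lightroom")):
        findings.append(f"Software tag names an editing tool: {software}")
    original, modified = exif.get("DateTimeOriginal"), exif.get("DateTime")
    if original and modified and original != modified:
        findings.append("Capture time differs from last-modification time")
    return findings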

2.2. Double Quantization Effect

This algorithm is based on quantization artifacts that appear when JPEG compression is applied more than once. If a JPEG file is opened, edited and saved again, such compression artifacts inevitably appear.

In order to detect the double quantization effect, the algorithm builds 192 histograms of discrete cosine transform values. Certain quantization effects appear in these histograms only if an image was saved in JPEG format more than once. If the effect is discovered, we can state definitively that the image was edited (or at least re-saved by a graphic editor) at least once. If the effect is not discovered, however, we cannot draw any conclusions about the image: it could, for example, have been developed from a RAW file, edited in a graphic editor and saved to JPEG just once.


The first two histograms represent a typical file that was only saved once. The other two demonstrate what happens to a JPEG image if it’s opened and saved as JPEG once again.


These two images look identical, although the second picture was opened in a graphic editor and then saved.

The following histograms make the difference clear.

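
For readers who want to reproduce such histograms, here is a minimal Python sketch for the luminance channel (the 192 histograms mentioned above correspond to 64 DCT positions in each of three channels). This illustrates the idea rather than the authors' exact implementation.

import numpy as np
from PIL import Image
from scipy.fftpack import dct

def dct_histograms(path: str, value_range: int = 50):
    """Histogram the DCT coefficients of every 8x8 luminance block.

    Double JPEG compression tends to leave periodic peaks and gaps
    (the double quantization effect) in these histograms.
    """
    y = np.asarray(Image.open(path).convert("L"), dtype=np.float64) - 128.0
    h, w = (d - d % 8 for d in y.shape)           # crop to whole 8x8 blocks
    blocks = [
        dct(dct(y[r:r + 8, c:c + 8], axis=0, norm="ortho"), axis=1, norm="ortho")
        for r in range(0, h, 8) for c in range(0, w, 8)
    ]
    coeffs = np.stack(blocks)                     # shape: (n_blocks, 8, 8)
    bins = np.arange(-value_range, value_range + 2)
    return {
        (u, v): np.histogram(np.round(coeffs[:, u, v]), bins=bins)[0]
        for u in range(8) for v in range(8)
    }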

2.3. Error Level Analysis

This algorithm detects foreign objects injected into the original image by analyzing quantization across blocks of pixels in the image. The quantization of pasted objects (as well as of objects drawn in an editor) may differ significantly from the rest of the image, especially if the original image or the injected objects (or both) were previously compressed in JPEG format.


While this may not be a perfect example, it still makes it very clear which of the four cats were originally in the image and which were pasted in during editing: the quantization deviation is significantly higher for the two cats on the left. The effect is considerably more pronounced when the pasted object is taken from a different image.
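
A widely used way to visualize this kind of inconsistency is the classic “re-save” form of error level analysis. The sketch below follows that generic approach and is not necessarily the exact method implemented in the plugin.

from PIL import Image, ImageChops

def error_level_analysis(path: str, quality: int = 90, scale: int = 15) -> Image.Image:
    """Re-save the image as JPEG at a known quality and amplify the difference.

    Regions that were pasted in or compressed differently from the rest of
    the picture tend to stand out in the amplified difference image.
    """
    original = Image.open(path).convert("RGB")
    resaved_path = path + ".ela.jpg"
    original.save(resaved_path, "JPEG", quality=quality)
    resaved = Image.open(resaved_path)
    difference = ImageChops.difference(original, resaved)
    # Amplify so that small per-pixel differences become visible
    return difference.point(lambda value: min(255, value * scale))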

2.4. Copy/Move Forgery and Clone Detection

An extremely common way of faking images is transplanting parts of the same image across the picture. For example, an editor may mask the existence of a certain object by “patching” it with a piece of background cloned from the same image, or may copy or move existing objects around the picture. The quantization tables of the different pieces will look very similar to the rest of the image, so we must employ methods that identify image blocks which look artificially similar to each other.


The second image is fake. Note that the second umbrella was not simply copied and pasted: the pasted object was scaled to appear larger (closer). The third image outlines the matching points that allow the cloned area to be detected.

Our solution employs several approaches including direct tile comparison across the image, as well as complex algorithms that are able to identify cloned areas even if varying transparency levels are applied to pasted pieces, or if an object is placed on top of the pasted area.
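
A naive version of the direct tile comparison mentioned above can be sketched as follows. It finds exact-match tiles only, whereas real detectors rely on robust features that survive scaling, blending and recompression.

from collections import defaultdict

import numpy as np
from PIL import Image

def find_duplicate_tiles(path: str, tile: int = 16, stride: int = 8):
    """Report pairs of distant image regions with identical pixel content."""
    pixels = np.asarray(Image.open(path).convert("L"))
    height, width = pixels.shape
    seen = defaultdict(list)
    matches = []
    for row in range(0, height - tile + 1, stride):
        for col in range(0, width - tile + 1, stride):
            key = pixels[row:row + tile, col:col + tile].tobytes()
            for prev_row, prev_col in seen[key]:
                # Ignore overlapping and immediately adjacent tiles
                if abs(row - prev_row) > tile or abs(col - prev_col) > tile:
                    matches.append(((prev_row, prev_col), (row, col)))
            seen[key].append((row, col))
    return matches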

2.5. Inconsistent Image Quality

JPEG is a lossy format. Every time an image is opened and saved in JPEG format, some apparent visual quality is lost and artifacts appear. You can easily reproduce the issue by opening a JPEG file, saving it, closing it, then opening and saving it again. Repeat this several times and you will start noticing the difference – sooner if higher compression levels are used.

Visual quality is not standardized and varies greatly between different JPEG compression engines. Different JPEG engines may produce vastly different files even when set to their highest-quality setting. As there is no uniform standard among JPEG implementations for judging the resulting visual quality of a file, we had to settle on our own internal scale; this was necessary in order to judge the quality of JPEG files produced by many different engines on the same scale.


This is the same image; the last three pictures are saved from the original at 90%, 70% and 50% quality respectively. The higher the level of compression, the more visible the blocking artifacts become. JPEG uses blocks of 8×8 pixels, and these blocks become more and more clearly visible as the image is re-saved.

According to our internal scale, JPEG images coming out of the camera normally have an apparent visual quality of roughly 80% (more or less, depending on camera settings and the JPEG compression engine employed by the camera processor). As a result, we expect an unaltered image to fall approximately within that range. However, as JPEG is a lossy compression algorithm, every time a JPEG image is opened and saved again there is a further loss of apparent visual quality – even at the lowest compression / highest quality setting.

The simplest way to estimate the apparent visual quality of an existing JPEG file would be to apply certain formulas to the channel quantization tables specified in the file's tags. However, altering the tags is all too easy, so our solution uses pixel-level analysis that can “see through” the quantization matrix.
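
For comparison, the simpler tag-level estimate looks roughly like the sketch below. It assumes IJG-style scaling of the standard luminance table and that table 0 is the luminance table; as noted above, tables can be forged, so this is only a sanity check.

from PIL import Image

# Standard IJG luminance quantization table (the quality-50 baseline)
STANDARD_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]

def approximate_jpeg_quality(path: str) -> float:
    """Rough IJG-style quality estimate from the file's luminance table."""
    tables = Image.open(path).quantization      # dict: table id -> 64 values
    luma = tables[0]
    # Average scaling of the actual table relative to the standard table
    scale = 100.0 * sum(luma) / sum(STANDARD_LUMA)
    return 5000.0 / scale if scale > 100 else 100.0 - scale / 2.0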

3. Non-Applicable Algorithms

Some techniques sound great on paper but don’t work that well (if at all) in real life. The algorithms described below may be used in lab tests performed under controlled circumstances, but stand no chance in real life applications.

3.1. Block artifact grid detection

This approach is likewise based on the ideas presented in [2] and [3]. The algorithm analyzes discrete cosine transform coefficients calculated over the 8×8 DCT blocks of a JPEG image; comparing the coefficients to one another can supposedly identify foreign objects such as those pasted from another image. In reality these changes turned out to be statistically insignificant and easily masked by the consecutive compression applied when the final JPEG image is saved. In addition, discrepancies can easily arise in the original image at the borders of different color zones.

3.2. Color filter array interpolation


Most modern digital sensors are based on the Bayer array.

This algorithm makes use of the fact that most modern digital cameras use sensors based on a Bayer array: pixel values of color images are determined by interpolating the readings of adjacent red, green and blue sub-pixels [4].

Based on this fact, a statistical comparison of adjacent blocks of pixels can supposedly identify discrepancies. In reality, we found no statistically meaningful differences, especially when an image had been compressed and re-compressed with a lossy algorithm such as JPEG. This method would probably give more meaningful results if lossless formats such as TIFF were widely used; in real-life applications, however, lossy JPEG is the de-facto standard for storing digital pictures, so the color filter array interpolation algorithm is of little use.

4. Implementation

The algorithms described in this paper have made it into a commercial product. They were implemented as a plugin to the forensic tool Belkasoft Evidence Center [http://forensic.belkasoft.com/]. The plugin enables Evidence Center to estimate how genuine images are by calculating the probability of alterations. The product is aimed at a forensic audience, allowing investigators, lawyers and law enforcement officials to validate whether digital pictures submitted as evidence are in fact acceptable.

Using Evidence Center equipped with the Forgery Detection plugin to analyze the authenticity of digital images is easy: the analysis is completely automated and produces a concise report for each analyzed image.


The plugin is available at http://forensic.belkasoft.com/en/forgery-detection.

5. Conclusion and Further Work

We developed a comprehensive software solution implementing algorithms based on statistical analysis of the information available in digital images. A neural network is employed to produce the final decision, judging the probability of an image being altered or original. Some of the algorithms employed in our solution are based on encoding and compression techniques, as well as on compression artifacts inherent to the de-facto standard JPEG algorithm. Most alterations performed on JPEG files are spotted right away with high probability.

Notably, our solution in its current state may miss certain alterations performed on uncompressed images or pictures compressed with a lossless codec.

Take, for example, a scenario in which an editor pastes slices from one RAW (TIFF, PNG, ...) image into another losslessly compressed file and then saves the final JPEG only once. In this case, our solution will be able to tell that the image was modified in graphic editing software, but it will likely be unable to detect the exact location of the foreign objects. However, if the pasted bits were taken from a JPEG file (which is rather likely, as most pictures today are stored as JPEGs), our solution will likely be able to pinpoint the exact location of the patches.

6. About the Authors


Alexey Kuznetsov is the Head of the GRC (Governance, Risk and Compliance) Department at the International Banking Institute. Alexey is an expert in business process modeling.

 


Yakov Severyukhin is the Head of the Photoreport Analysis Laboratory at the International Banking Institute. Yakov is an expert in digital image processing.

 


Oleg Afonin is Belkasoft's sales and marketing director. He is an expert and consultant in computer forensics.

 


Yuri Gubanov is the CEO of Belkasoft and a renowned computer forensics expert. He is a frequent speaker at industry conferences such as CEIC, HTCIA, FT-Day, ICDDF, TechnoForensics and others.

The authors can be contacted by email at contact@belkasoft.com

7. References

1. Protecting Journalistic Integrity Algorithmically http://lemonodor.com/archives/2008/02/protecting_journalistic_integrity_algorithmically.html#c22564

2. Detection of Copy-Move Forgery in Digital Images http://www.ws.binghamton.edu/fridrich/Research/copymove.pdf

3. John Graham – Cumming’s Clone Tool Detector http://www.jgc.org/blog/2008/02/tonight-im-going-to-write-myself-aston.html

4. Demosaicking: Color Filter Array Interpolation http://www.ece.gatech.edu/research/labs/MCCL/pubs/dwnlds/bahadir05.pdf

5. Retrieving Digital Evidence: Methods, Techniques and Issues http://forensic.belkasoft.com/en/retrieving-digital-evidence-methods-techniques-and-issues



Cyberbullying – a growing concern in a connected society

Megan Meier was just twelve years old when the events began that would ultimately lead to her death. Like many teenagers, Megan had accounts on common social networks, including MySpace, where she first met “Josh Evans”. Ostensibly a sixteen-year-old boy, “Josh” was actually a persona created by Sarah, an old friend of Megan’s; Sarah’s mother, Lori Drew; and Ashley, a teenage employee of Drew’s. Megan and “Josh” became online friends, and her family were pleased that she seemed generally happier. However, on Monday, 16th of October, 2006, “Josh” sent a message to Megan stating that he no longer wished to continue their friendship. “The world would be a better place without you”, he claimed. Some of the private messages Megan had sent during the course of their online acquaintanceship were posted publicly, and defamatory bulletins were written about her and shared with other members of the site.

Shortly after the friendship came to an end, Megan was found hanged in her bedroom closet.

What moves people to do such things? How big a problem is cyberbullying? And what, if anything, can be done about it?

Bullying is nothing new in our society. In schools and workplaces around the world, some individuals are victimised by people who gain self-validation by bringing others down. With the invention of the internet, however, bullying has taken on a whole new dimension. Now it doesn’t stop in the schoolyard or on the walk home from work; it carries on in your bedroom, sits on your sofa with you when you’re holding your smartphone, and even takes place in your absence, only to be discovered when you next log on.

But isn’t it different from “normal” bullying? Surely, some argue, it must be possible for people to just not have a social networking account, or to change their email address, or just to read a book in the evening instead of turning on the computer?

Perhaps. But in today’s society, technology is everywhere, and setting yourself apart from it can put you at a disadvantage both personally and professionally. For many people today, the line between online and offline life isn’t just blurred, it’s non-existent. Your smartphone alarm wakes you up, and you check your email before you get out of bed. On the way to work, your friends ping messages at you through social networking applications. At work, your inbox fills with business and personal messages. When you go home, you turn on your connected TV and watch it whilst absent-mindedly scrolling through your favourite websites. In this kind of world, cyberbullying isn’t confined to some other realm; it’s going everywhere with you, all the time. And cyberbullying is notoriously difficult to investigate; for one thing, legal jurisdiction in cybercrime is not always easily defined. Cyberbullies may go to great lengths to protect their online anonymity, using public computers and anonymous email resenders to ensure that their own name is not tied to any of the acts.

Carole Phillips is a trustee for BulliesOut, a charity that works to combat bullying both online and offline. She is also a child protection officer who teaches children about Internet safety. We asked her how large a problem cyberbullying really is in today’s society.

“The media attention given to the tragic cases such as the suicide of Hannah Smith and Daniel Perry is only the tip of the iceberg. Ask any school today and they will tell you that at the root cause of any falling out with friends or any bullying problem, you will soon uncover that [social networking sites] are at the centre of the dispute.

Young people today are known as digital natives because they have been raised on the emergence of technology and are very adept at getting to grips with anything new that would take the older generation a little longer to grasp. And therein lies the problem: unless we are professionals who work with young people or in the field in which social media is part of our world too, there has not been the same level of understanding of how social media works and impacts on young people’s lives. With no boundaries or little understanding about ‘how things work’, young people are playing in a lawless society online without the emotional capacity or maturity to deal with issues when things go wrong or an adult to steer them in the right direction.”

A sobering thought. So what can be done? Phillips elaborates:

“In order to move forward and equip people who use social media with the tools to deal with it when things get out of hand, adults such as educators, social workers, youth workers and most importantly parents, need to get to grips with exactly what their children are exposed to, instill levels of morality in them and remind young people over and over again that they should not act online in a way that they would not act in the real world. If you were made to say nasty, vile comments to someone’s face as opposed to someone online, I think you would think twice as you no longer have the veil of anonymity to hide behind.

“Schools have a big role to play. Educate all staff on social media, not just ICT teachers or child protection officers; make it a whole school approach and learn how to recognise when things are not going well. Engage parents in training and work together with them and share the responsibility of working together to safeguard young people. Most importantly, teach young people about the consequences of using social media in a negative way as your digital footprint is there forever and you leave a trail behind you that you may not want people to see not only now, but in years to come, be it your parents, families and friends but also potential future employers. The message I would say to all users is act responsibly as you cannot use technology as a way to behave in a way you would not if it did not exist.”

It is evident that cyberbullying is a growing issue in today’s society. Anonymity online can be misused to threaten and victimise others, particularly young people who may spend a lot of time on the Internet and feel high levels of pressure to fit in with their peer group. As digital forensics professionals, the best we can do reactively is to ensure our investigations are thorough and adept. But as Phillips points out, the way to prevention is through education; people need to understand more about the repercussions of their own actions online, and know what to do if they or someone they know are feeling threatened by another party’s online behaviour.

If you are concerned about issues related to cyberbullying, the following organisations can help:

BulliesOut – http://www.bulliesout.com
CyberSmile – http://www.cybersmile.org



Autopsy 3: Windows-based, Easy to Use, and Free

If you are like many digital investigators, you’ve heard about the Autopsy™ digital forensics tool and associate it with a course that used Linux to analyze a device.  Or, maybe you associate it with a book that made references to the Linux/OS X tool, but it wasn’t applicable to you at the time because you were using Windows. This article is about how Autopsy 3 is different.  In fact it is a complete rewrite from version 2 and is now applicable to everyone.  It will change the way you think about digital forensics tools.

Runs on Windows and Easy to Use

Let’s start off with the fundamentals: Autopsy 3 runs on Windows with an easy to use, double-click installer. No dependency hells that you may typically associate with open source tools.  No esoteric download paths or source code repositories to navigate through.  Just download the latest from http://sleuthkit.org/autopsy and run the installer.

Note: We’re also working on the Linux and OS X packages, but Windows has been the primary focus. Stay tuned for when these are available.

Autopsy 3 has been developed with an overarching goal of providing an intuitive layout and workflow. For instance, all analysis results are found in a single tree on the left-hand side rather than strewn about in several areas.

(Screenshot: Autopsy 3.0.7 interface overview.)

Autopsy has wizards to guide you through each step of the process and has many interface features to make your investigations faster.

For example:

  • When you find a file from a keyword search or hashset hit, you can right click on it to view its parent directory to see what else is near that file.

  • Back and Forward history buttons to allow you to backtrack when you realize that your investigation went down the wrong path.

  • The “views” node in the main evidence tree contains many common file type, size and date based filters to quickly and easily view files that meet these criteria.

Familiar Features and Fast Results

Now that we’ve covered that Autopsy 3 is more applicable than you may remember, let’s cover how it can help you. It has the standard set of features that you need from a digital forensics tool and most of the features you’ll find in commercial offerings:

  • File system analysis and recovery using The Sleuth Kit™, which has support for NTFS, FAT, Ext2/3/4, Yaffs2, UFS, HFS+, ISO9660

  • Indexed Keyword Search using Apache SOLR (More…)

  • Hash database support for EnCase, NSRL, and HashKeeper hashsets.

  • Registry analysis using RegRipper

  • Web browser analysis for Firefox, Chrome, Safari, and IE including automated discovery of bookmarks, history, and web searches

  • Thumbnail views and video playback

  • MBOX Email analysis

  • Visual Timeline analysis (More…)

  • Tagging and Reporting in HTML and Excel

  • Coming Soon: 64-bit support and Scalpel integration for carving

Autopsy is also built to give you fast results.  As soon as you add an image to a case, the analysis begins and continues in the background.  As soon as a hash hit is found, you’ll know about it.  You won’t need to wait until the entire drive is done. Autopsy prioritizes how it analyzes the files to focus on user content first.

Extensible and Evolves With Your Needs

Autopsy 3 was designed to be a platform for 3rd-party modules. Development began after the first Open Source Digital Forensics Conference in 2010 when discussions highlighted the need for a platform that would allow a user to perform an end-to-end investigation using open source tools. People were tired of needing to use several stand-alone tools with different input requirements and report formats to perform an investigation.  Autopsy 3 was developed to be that platform.

Autopsy 3 has several frameworks in its design to allow other developers to write plug-in modules. Here are some examples:

  • Ingest modules run on the disk images and logical files to extract evidence and artifacts from them.  Many of the features previously listed, such as keyword search and hashset analysis, are implemented as ingest modules.

  • Content viewer modules display a file to the user in different ways, such as Hex, Video playback, or static analysis of an executable.

  • Report modules create a final report for the investigation

We know Autopsy 3 can’t solve everyone’s problems straight out of the box and we want developers to write modules instead of stand-alone tools. Writing modules is easier than stand-alone tools because the Autopsy platform takes care of all the boilerplate forensics development, like knowing about disk images versus logical files, UIs, and reporting.

If you are a developer, we have full module writing documentation and sample modules. To motivate you a bit more, Basis Technology is organizing an Autopsy module writing competition. Developers have until Oct 21, 2013 to write a module, and the attendees of the 4th Annual Open Source Digital Forensics Conference will get to vote on who gets the cash prize.

Free

You can download the Autopsy installer and get up and running on your Windows machine from http://www.sleuthkit.org/autopsy/ or you can visit the source code repository at https://github.com/sleuthkit/autopsy and see the inner workings, repackage, and improve the software.

Note: We are also planning on a developer focused article, so stay tuned for that.

If you run into any problems or have questions, submit them to the sleuthkit-users email list.  If you have any feature ideas, then submit them to the github issue tracker.



Geo-tag Forensics

Introduction

A geo-tagged image is an image which holds geographical identification metadata. This data consists of latitude and longitude coordinates (and sometimes altitude). Though there are some extremely powerful tools available for extracting geo-tag information from geo-tagged images, insight into how a tool actually works and obtains the data for us is always a plus.

We know validation is at the core of any forensic work. One may use other tools and/or manually extract the data at byte level to validate the findings. This article shows how to go about parsing the geo-tags of images at byte level.

Collection of Geo-tag images

For the purpose of this experiment we took pictures from multiple devices such as

-          iPhone 3GS

-          iPod 4

-          Nexus 7

-          LG Optimus

-          HTC

We turned on the location service on every device in order to capture geo-tagged pictures.

It was interesting to note that the Nexus 7 did not have a local camera application, so the application ‘Cameringo’, which can attach geo-tags to pictures, was installed. Not all camera applications have this feature.

Geo-tag analysis of the images

-          The easiest and quickest way of checking the geo-location of a picture in Windows is to right-click the picture, select Properties and look under the Details tab.

-          One can also use a tool to extract geo-tags and other metadata; for instance, exiftool is a free and powerful tool.

-          One might have to validate findings or deal with cases where automatic recovery of the geo-tag is not possible and manual parsing of the raw image is required. This is what the article focuses on.

Manually parsing raw image

All devices used in this experiment were found to follow the EXIF (Exchangeable Image File Format) standard of metadata logging.

Since the length and content of the metadata (for example, make and model of camera, software, author, time, etc.) vary from device to device, it is not surprising to see different starting offsets for the geo-tag data. In other words, we could not find a consistent offset for the geo-tag within the image, but we did find a pattern that helps to home in on the geo-tag offsets.

Scheme

We used the following approach in our analysis.

1          Find the two direction letters i.e. N, S, E or W.

2         Then look at the offsets that follow for the pattern 00 00 xx xx 00 00 00 01 00 00 xx xx 00 00 00 01 00 00 xx xx 00 00 00 64 (or 00 00 03 E8). There is no predefined offset location to start with. The values are usually in big endian, but we encountered one case involving little endian and a reversed reading order.

3         Do your calculation and convert those values to something that makes sense.

A couple of things to keep in mind before jumping to the calculation part:

-          Read set of 4 bytes for every value

-          The hex values given could be in big endian or little endian

-          00 00 00 64 = 100 decimal (you don’t need to convert it every time – friendly advice)

-          1 minute = 60 seconds

-          00 00 03 E8 = 1000 decimal

4         Use the direction letters for latitude and longitude respectively, in order.

The following sections demonstrate parsing of images from different sources.

Tools used

WinHex (any basic hex editor can be used)

Calculator (native application in Windows OS)

Image from iPhone

Offset          0    1    2   3   4    5   6   7    8    9  10 11 12 13 14 15

00001120   32 30 31 33 3A 30 35 3A  33 30 20 31 34 3A 30 39   2013:05:30 14:09

00001136   3A 31 39 00 32 30 31 33  3A 30 35 3A 33 30 20 31   :19 2013:05:30 1

00001152   34 3A 30 39 3A 31 39 00  00 00 31 8D 00 00 05 B1   4:09:19   1    ±

00001168   00 00 10 B9 00 00 05 A1  00 00 26 0D 00 00 05 26      ¹   ¡  &    &

00001184   00 00 00 4D 00 00 00 14  03 FF 02 FF 02 66 02 66      M     ÿ ÿ f f

00001200   00 0A 00 01 00 02 00 00  00 02 4E 00 00 00 00 02             N

00001216   00 05 00 00 00 03 00 00  02 CE 00 03 00 02 00 00            Î

00001232   00 02 57 00 00 00 00 04  00 05 00 00 00 03 00 00     W

00001248   02 E6 00 05 00 01 00 00  00 01 00 00 00 00 00 06    æ

00001264   00 05 00 00 00 01 00 00  02 FE 00 07 00 05 00 00            þ

00001280   00 03 00 00 03 06 00 10  00 02 00 00 00 02 54 00                 T

00001296   00 00 00 11 00 05 00 00  00 01 00 00 03 1E 00 1D

00001312   00 02 00 00 00 0B 00 00  03 26 00 00 00 00 00 00            &

00001328   00 1C 00 00 00 01 00 00  00 25 00 00 00 01 00 00            %

00001344   0B F4 00 00 00 64 00 00  00 51 00 00 00 01 00 00    ô   d   Q

00001360   00 17 00 00 00 01 00 00  15 53 00 00 00 64 00 00            S   d

00001376   34 E9 00 00 01 12 00 00  00 12 00 00 00 01 00 00   4é

00001392   00 09 00 00 00 01 00 00  00 13 00 00 00 01 00 04

00001408   1D 6F 00 00 03 74 32 30  31 33 3A 30 35 3A 33 30    o   t2013:05:30

00001424   00 00 FF E1 02 B0                                    ÿá °

Step 1

Scroll down until you find a direction letter, which is usually present near the date/time stamp. We found N and W.

Step 2

Look for the pattern 00 00 00 01 00 00 xx xx 00 00 00 01 … 00 00 00 64. Once the pattern is identified, find 4 bytes before the first set of 00 00 00 01 (highlighted with Turquoise) i.e. byte offset 1326 here.

Step 3

This is the most interesting step of calculation.

a)       Let’s start converting hex to dec from offset 1326.

00 00 00 1C => 28 and next 4 bytes are 00 00 00 01 => 1 (Divisor)

Thus the value of degree is 28/1 = 28.

Go to next 4 byte set whose value corresponds to 00 00 00 25 => 37, followed by another set of 00 00 00 01 => 1 (divisor). Thus 37/1 = 37 is the value of minutes.

Now calculating decimal value of seconds. Convert next 4 byte set 00 00 0B F4 => 3060. This set is followed by 00 00 00 64 => 100 (divisor). This time when you divide, you get, 3060/100 = 30.60 seconds.

This completes the latitude calculation which is 28:37:30.60

b)      We will continue, reading and converting the hex values for longitude and altitude.

00 00 00 51 => 81, next we have, 00 00 00 01 => 1. Degree is 81/1 = 81.

Then we have 00 00 00 17 => 23 and 00 00 00 01 => 1. Minutes comes out to be 23/1 = 23.

For seconds we have, 00 00 15 53 => 5459. Divisor is 00 00 00 64 => 100. Thus seconds is 5459/100 = 54.59.

Thus longitude is 81:23:54.59

c)       Similarly calculate altitude. The value we have in hex is 00 00 34 E9 => 13545. Divisor is 00 00 01 12 => 274. Altitude becomes 13545/274 = 49.4343

Step 4

Finally, assign the directions in order and correlate the information with what the file properties display in the capture below.

Latitude: 28:37:30.60 N

Longitude: 81:23:54.59 W

Altitude: 49.4343
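
The same arithmetic can be expressed in a few lines of Python. This is a minimal sketch assuming big-endian EXIF RATIONAL values laid out exactly as in the iPhone dump above; the helper names are ours.

def rational(raw: bytes, offset: int) -> float:
    """Read one EXIF RATIONAL: a 32-bit value followed by a 32-bit divisor."""
    value = int.from_bytes(raw[offset:offset + 4], "big")
    divisor = int.from_bytes(raw[offset + 4:offset + 8], "big")
    return value / divisor

def dms_at(raw: bytes, offset: int):
    """Degrees, minutes and seconds stored as three consecutive RATIONALs."""
    return tuple(rational(raw, offset + 8 * i) for i in range(3))

# The 24 latitude bytes starting at offset 1326 of the iPhone image above:
iphone_latitude = bytes.fromhex(
    "0000001C00000001"   # 28 / 1    degrees
    "0000002500000001"   # 37 / 1    minutes
    "00000BF400000064"   # 3060 / 100 = 30.60 seconds
)
print(dms_at(iphone_latitude, 0))   # (28.0, 37.0, 30.6)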

(Screenshot: Windows file properties for the iPhone image.)

 

Image from iPod

Offset           0   1   2    3   4   5   6   7    8    9   10 11 12 13 14 15

00000512   00 00 00 05 32 30 31 33  3A 30 38 3A 32 32 20 31       2013:08:22 1

00000528   33 3A 32 37 3A 30 37 00  32 30 31 33 3A 30 38 3A   3:27:07 2013:08:

00000544   32 32 20 31 33 3A 32 37  3A 30 37 00 00 00 E0 FF   22 13:27:07   àÿ

00000560   00 00 30 FC 00 00 12 ED  00 00 07 7E 00 00 07 8E     0ü   í   ~   Ž

00000576   00 00 07 25 00 00 00 4D  00 00 00 14 00 07 00 01      %   M

00000592   00 02 00 00 00 02 4E 00  00 00 00 02 00 05 00 00         N

00000608   00 03 00 00 02 9A 00 03  00 02 00 00 00 02 57 00        š        W

00000624   00 00 00 04 00 05 00 00  00 03 00 00 02 B2 00 05                ²

00000640   00 01 00 00 00 01 00 00  00 00 00 06 00 05 00 00

00000656   00 01 00 00 02 CA 00 07  00 05 00 00 00 03 00 00        Ê

00000672   02 D2 00 00 00 00 00 00  00 1C 00 00 00 01 00 00    Ò

00000688   0D 90 00 00 00 64 00 00  00 00 00 00 00 01 00 00        d

00000704   00 51 00 00 00 01 00 00  04 D5 00 00 00 64 00 00    Q       Õ   d

00000720   00 00 00 00 00 01 00 00  38 FC 00 00 01 B5 00 00           8ü   µ

00000736   00 11 00 00 00 01 00 00  00 1A 00 00 00 01 00 00

00000752   13 3C 00 00 00 64 00 06  01 03 00 03 00 00 00 01    <   d

00000768   00 06 00 00 01 1A 00 05

Step 1

Look for direction letters (yellow highlighted).

Step 2

Identify the pattern (green highlighted).

Step 3

a)       Start with the set of 4 bytes, offset 678-681 (highlighted Turquoise).

00 00 00 1C => 28

Next 4 bytes 00 00 00 01 gives divisor, 28/1= 28 degrees

Count from offset 690-693, 00 00 0D 90 => 3472

Again the next 4 bytes gives divisor 00 00 00 64 => 100 and the value of minute will be 3472/100 = 34.72.

Note that we have the minutes in decimal, and we know that

1 degree = 60 minutes

1 minute = 60 seconds.

Therefore converting minutes to seconds as 0.72 * 60 = 43.2 seconds.

Since the seconds have already been derived from the minutes here, the next 8 bytes, 00 00 00 00 00 00 00 01, are of no use.

That completes our latitude calculation which is 28:34:43.2

b)      Moving forward to longitude bytes located at offset 702.

00 00 00 51 => 81 and 81/1 (next 4 bytes 00 00 00 01) = 81

Next, 00 00 04 D5 => 1237 and 1237/100 (next 4 bytes are 00 00 00 64) = 12.37.

Minutes = 12

Seconds = 0.37*60 = 22.2

Again, since the seconds have been calculated from the minutes, there are no separate 8 bytes for the seconds, and you will see ‘00 00 00 00 00 00 00 01’ there.

Longitude is 81:12:22.2

c)       For altitude,

00 00 38 FC => 14588

Divisor is 00 00 01 B5 => 437

14588/437 = 33.38215

Step 4

Use the direction letters,

Latitude: 28:34:43.2 N

Longitude: 81:12:22.2 W

Altitude: 33.38215

Finally, correlate the findings with the values shown in the picture properties.

(Screenshot: Windows file properties for the iPod image.)

 

Image from Nexus 7

Offset          0    1   2   3   4    5   6   7      8  9   10 11 12 13 14 15

00000144   00 00 32 30 31 33 3A 30  38 3A 32 32 20 31 33 3A     2013:08:22 13:

00000160   33 34 3A 30 36 00 43 61  6D 65 72 69 6E 67 6F 20   34:06 Cameringo

00000176   44 65 6D 6F 00 00 04 00  01 00 02 00 02 00 00 00   Demo

00000192   4E 00 00 00 04 00 05 00  03 00 00 00 E0 00 00 00   N           à

00000208   03 00 02 00 02 00 00 00  57 00 00 00 02 00 05 00           W

00000224   03 00 00 00 F8 00 00 00  00 00 00 00 51 00 00 00       ø       Q

00000240   01 00 00 00 0C 00 00 00  01 00 00 00 6C 54 00 00               lT

00000256   E8 03 00 00 1C 00 00 00  01 00 00 00 22 00 00 00   è           “

00000272   01 00 00 00 CA A0 00 00  E8 03 00 00 02 00 01 02       Ê   è

00000288   00 04 00 01 00 00 2E 01  00 00 02 02 04 00 01 00         .

00000304   00 00 00 00 00 00 00 00  00 00 FF                            ÿ

Step 1

Look for direction letters (yellow highlighted).

Step 2

Identify the pattern (green highlighted).

Step 3

The reading of bytes is completely reversed here: one has to read backwards to get the latitude and longitude. In addition, the values are stored in little endian.

a)       Starting with offset 283 and going backwards. Thus from 283 – 280 (4 bytes) we have,

00 00 03 E8 (divisor) => 1000

From offset 279 – 276, 00 00 A0 CA => 41162

41162/1000 => 41.162 seconds

00 00 00 01 (divisor) = 1

00 00 00 22 => 34

34/1 = 34 minutes

Then we have 00 00 00 01 = 1

00 00 00 1C = 28

28/1 = 28 degrees

We got our latitude = 28:34:41.162

b)      Continue going backwards where we left from,

00 00 03 E8 => 1000

00 00 54 6C => 21612

21612/1000 = 21.612 seconds

Then 00 00 00 01 is the divisor

00 00 00 0C => 12

12/1 = 12 minutes

Next we have 00 00 00 01 (divisor) and

00 00 00 51 => 81

81/1 = 81 degrees

Longitude becomes

81:12:21.612
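
Read forwards rather than backwards, the same bytes appear to be ordinary (value, divisor) pairs stored in little endian, so the helper from the iPhone example only needs its byte order flipped. Below is a sketch using the latitude triple starting at offset 260 of the dump above (our reading of the layout).

def rational_le(raw: bytes, offset: int) -> float:
    """EXIF RATIONAL with little-endian byte order (value, then divisor)."""
    value = int.from_bytes(raw[offset:offset + 4], "little")
    divisor = int.from_bytes(raw[offset + 4:offset + 8], "little")
    return value / divisor

# Latitude triple as stored (forward order, little endian):
nexus_latitude = bytes.fromhex(
    "1C00000001000000"   # 28 / 1    degrees
    "2200000001000000"   # 34 / 1    minutes
    "CAA00000E8030000"   # 41162 / 1000 = 41.162 seconds
)
print(tuple(rational_le(nexus_latitude, 8 * i) for i in range(3)))
# (28.0, 34.0, 41.162)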

Step 4

With direction,

Latitude: 28:34:41.162 N

Longitude: 81:12:21.612 W

Finally, match the extracted information with what Windows identified locally, as shown below.

(Screenshot: Windows file properties for the Nexus 7 image.)

Image from LG

Offset          0    1   2   3    4   5   6   7    8   9   10 11 12 13 14 15

00001216   00 00 03 00 00 00 01 00  00 00 00 00 00 00 01 00

00001232   00 01 CC 00 00 00 64 00  00 03 E8 00 00 00 01 00     Ì   d   è

00001248   00 00 00 00 00 00 00 00  01 00 00 00 00 FF FF 00                ÿÿ

00001264   08 00 01 00 02 00 00 00  02 4E 00 00 00 00 02 00            N

00001280   05 00 00 00 03 00 00 06  5D 00 03 00 02 00 00 00           ]

00001296   02 57 00 00 00 00 04 00  05 00 00 00 03 00 00 06    W

00001312   75 00 05 00 01 00 00 00  01 00 00 00 00 00 06 00   u

00001328   05 00 00 00 01 00 00 06  8D 00 07 00 05 00 00 00

00001344   03 00 00 06 95 00 1D 00  02 00 00 00 0B 00 00 06       •

00001360   AD 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ­

00001376   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001392   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001408   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001424   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001440   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001456   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001472   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001488   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001504   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001520   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001536   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001552   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001568   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001584   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001600   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001616   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001632   00 00 00 00 00 00 00 00  00 00 00 00 1C 00 00 00

00001648   01 00 00 00 22 00 00 00  01 00 00 A2 21 00 00 03       “      ¢!

00001664   E8 00 00 00 51 00 00 00  01 00 00 00 0C 00 00 00   è   Q

00001680   01 00 00 56 FE 00 00 03  E8 00 00 00 12 00 00 00      Vþ   è

00001696   01 00 00 00 11 00 00 00  01 00 00 00 17 00 00 00

00001712   01 00 00 00 18 00 00 00  01 32 30 31 33 3A 30 38            2013:08

00001728   3A 32 32 00 00 08 01 00  00 04 00 00 00 01 00 00   :22

00001744   00 A0 01 01 00 04 00 00  00 01 00 00 00 78 01 03                x

00001760   00 03 00 00 00 01 00 06  00 00 01 1A 00 05 00 00

00001776   00 01 00 00 08 3E 01 1B  00 05 00 00 00 01 00 00        >

00001792   08 46 01 28 00 03 00 00  00 01 00 02 00 00 02 01    F (

00001808   00 04 00 00 00 01 00 00  08 4E 02 02 00 04 00 00           N

00001824   00 01 00 00 13 A8 00 00  00 00 00 00 00 00 00 00        ¨

00001840   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

00001856   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00

Step 1

Look for the direction letters N and W (highlighted in yellow).

Step 2

Identify the pattern (highlighted in green). Notice that in this case the pattern sits much farther from the direction letters (around offset 1640) than in the previous examples.

Step 3

a)       In this dump the 4-byte values are stored big-endian, so read them forward:

00 00 00 1C => 28

00 00 00 01 => 1 (divisor)

28/1 = 28 degrees

00 00 00 22 => 34

00 00 00 01 = 1

34/1 = 34 minutes

00 00 A2 21 => 41505

00 00 03 E8 => 1000

41505/1000 = 41.505 seconds

Thus latitude is 28:34:41.505

b)      Continue reading the hex forward,

00 00 00 51 => 81

00 00 00 01 => 1 (divisor)

81/1 = 81 degrees

Next 4 bytes are

00 00 00 0C => 12

00 00 00 01 => 1

12/1 = 12 minutes

00 00 56 FE => 22270

00 00 03 E8 => 1000

22270/1000 => 22.27 seconds

Thus longitude = 81:12:22.27

c)       Altitude is 18

Because,

00 00 00 12 => 18

00 00 00 01 => 1

18/1 = 18

Step 4

With direction,

Latitude: 28:34:41.505 N

Longitude: 81:12:22.27 W

Altitude: 18 (meters)
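The same decoding can be scripted. Compared with the sketch for the previous image, the only changes are the big-endian struct format (">I") and the fact that the pattern is read forward; the bytes below are hand-transcribed from the dump above, starting at the pattern around offset 1641.

import struct

# Bytes transcribed by hand from the LG dump above. Here every 4-byte value
# is stored big-endian and the rationals run forward: latitude, longitude,
# then altitude.
raw = bytes.fromhex(
    "0000001C" "00000001"   # latitude degrees:   28 / 1
    "00000022" "00000001"   # latitude minutes:   34 / 1
    "0000A221" "000003E8"   # latitude seconds:   41505 / 1000
    "00000051" "00000001"   # longitude degrees:  81 / 1
    "0000000C" "00000001"   # longitude minutes:  12 / 1
    "000056FE" "000003E8"   # longitude seconds:  22270 / 1000
    "00000012" "00000001"   # altitude:           18 / 1 (meters)
)

vals = struct.unpack(">" + "I" * (len(raw) // 4), raw)       # big-endian uint32s
lat_d, lat_m, lat_s, lon_d, lon_m, lon_s, alt = (
    n / d for n, d in zip(vals[0::2], vals[1::2])
)
print(f"Latitude : {lat_d:.0f}:{lat_m:.0f}:{lat_s:.3f} N")   # 28:34:41.505 N
print(f"Longitude: {lon_d:.0f}:{lon_m:.0f}:{lon_s:.3f} W")   # 81:12:22.270 W
print(f"Altitude : {alt:.0f}")                               # 18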

Finally match the values to verify.

[Screenshot: image properties for the LG photo confirming the extracted values]

Image from HTC phone

Offset           0   1   2   3    4   5    6   7    8   9   10 11 12 13 14 15

00004528   31 31 3A 30 34 3A 31 32  20 31 31 3A 30 37 3A 31   11:04:12 11:07:1

00004544   39 00 32 30 31 31 3A 30  34 3A 31 32 20 31 31 3A   9 2011:04:12 11:

00004560   30 37 3A 31 39 00 00 00  01 EC 00 00 00 64 00 01   07:19    ì   d

00004576   00 02 00 07 00 00 00 04  30 31 30 30 00 00 00 00           0100

00004592   00 00 00 00 00 0B 00 00  00 01 00 00 00 03 02 02

00004608   00 00 00 01 00 02 00 00  00 02 4E 00 00 00 00 02             N

00004624   00 05 00 00 00 03 00 00  12 72 00 03 00 02 00 00            r

00004640   00 02 57 00 00 00 00 04  00 05 00 00 00 03 00 00     W

00004656   12 8A 00 05 00 01 00 00  00 01 00 00 00 00 00 06    Š

00004672   00 05 00 00 00 01 00 00  12 A2 00 07 00 05 00 00            ¢

00004688   00 03 00 00 12 AA 00 12  00 02 00 00 00 07 00 00        ª

00004704   12 C2 00 1B 00 07 00 00  00 0F 00 00 12 CA 00 1D               Ê

00004720   00 02 00 00 00 0B 00 00  12 DA 00 00 00 00 00 00            Ú

00004736   00 26 00 00 00 01 00 00  00 2A 00 00 00 01 00 00    &       *

00004752   0A 45 00 00 00 64 00 00  00 4D 00 00 00 01 00 00    E   d   M

00004768   00 04 00 00 00 01 00 00  0D 7C 00 00 00 64 00 00            |   d

00004784   00 00 00 00 00 01 00 00  00 0F 00 00 00 01 00 00

00004800   00 07 00 00 00 01 00 00  00 13 00 00 00 01 57 47                 WG

00004816   53 2D 38 34 00 00 41 53  43 49 49 00 00 00 4E 45   S-84  ASCII   NE

00004832   54 57 4F 52 4B 00 32 30  31 31 3A 30 34 3A 31 32   TWORK 2011:04:12

00004848   00 00 00 00 00 06 01 03  00 03 00 00 00

Step 1

Look for the direction letters N and W (highlighted in yellow).

Step 2

Identify the pattern (highlighted in green).

Step 3

a)       For latitude (again reading the 4-byte values forward, big-endian):

00 00 00 26 => 38

38/1 (divisor) = 38 degrees

00 00 00 2A => 42

42/1 = 42 minutes

00 00 0A 45 => 2629

2629/100 (divisor) = 26.29 seconds

Latitude is 38:42:26.29

b)      For longitude

00 00 00 4D => 77

77/1 = 77 degrees

00 00 00 04 => 4

4/1 = 4 minutes

00 00 0D 7C => 3452

3452/100 = 34.52 seconds.

Longitude is 77:4:34.52

Step 4

Match the calculated values with those shown in the image properties.

[Screenshot: image properties for the HTC photo confirming the extracted values]
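As a final cross-check, the GPS values can also be pulled with an EXIF-aware library rather than a hex editor. The snippet below is only a sketch: it assumes a reasonably recent version of the Pillow library is installed (pip install Pillow) and that the photo has been saved out as htc_photo.jpg, a hypothetical filename. The printed degrees/minutes/seconds rationals should match the hand-decoded 38:42:26.29 N and 77:4:34.52 W.

from PIL import Image
from PIL.ExifTags import GPSTAGS

# Open the exported photo and read its EXIF block.
exif = Image.open("htc_photo.jpg").getexif()
gps = exif.get_ifd(0x8825)   # 0x8825 is the GPSInfo IFD pointer

# GPSLatitude and GPSLongitude come back as three rationals each
# (degrees, minutes, seconds), alongside the N/S and E/W reference letters.
for tag_id, value in gps.items():
    print(GPSTAGS.get(tag_id, tag_id), value)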

 


