Timelining using SQLite Write Ahead Logs

Todays question is: Can we tell when records have been deleted from an SQLite database?

TL;DR – We can provide some time and date information in very particular circumstances using the WAL log file. It can be very time consuming!

Quick warning – This article will assume some knowledge of SQLite databases. I will probably do some further posts on the fundamentals of SQLite databases as I think they are essential for any Digital Investigator. I will be doing posts on SQLite work in general and as I write them I will try and update this one with links to the other information.

Background

This came up as part of an investigation where a number of text messages were deleted from a mobile phone device and it was necessary to attribute time and date information to the deletion. The records were some of the last ones on the device and it had been turned off/very little usage after the deletion.

The work I completed for this case obviously can’t be shared, and it was harder work to create test data for an IOS device for SMS/iMessage database, so throughout this I will be using test data from the Mozilla Firefox browser. The same principles apply to the original use case, though the technical details are slightly different.

Tools used throughout this post:

  • X-Ways Forensics – can use other forensic tools and work around, but template files require X-Ways.
  • SQLite Viewer – SQLiteSpy 1.9.14 (x64) – Download from here
  • XT_SQLiteRecordDecoder 0.3 – One of my creations, available here
  • Some X-Ways Templates – Available on downloads page here

Data used in examples:

  • Firefox WAL Data.vhd – Available here

Starting point

A quick examination of the SQLite database, that the records in question were located in, shows that the records are currently deleted in the database when the SQLite database and associated files (WAL and SHM files) were extracted from the exhibit. However, if just the SQLite database was extracted and viewed, the records were still showing as live in the database.

So why is this happening? Well the SQLite database system uses journaling to keep the integrity of a database if something unexpected happens (power outage to device etc.). There are two methods in use; Rollback Journal and Write Ahead Logs (WAL).

Rollback journals include information to restore a database back to its original state if a transaction fails, write ahead logs contain updates to the database that are then committed to the main database at a later time. In this case, the database that we were examining uses a Write Ahead Log, so Rollback journals will not be discussed further here.

The important information, that allows use to deduce dates and times for record deletions in this case, is that records in the SQLite databases have dates and times as part of the records. In the case of the SMS/iMessage database this would be the sent/received times and with the Firefox database we have last visited/visited times and dates. Without this information we would probably not be able to determine when the records were deleted.

Overview of WAL files

In order to understand how we are going to use these files to associate time and date information, we need to know some important information on how the WAL file works. The WAL file is always in the same folder as the database and has the same name but with “-wal” appended to it.

The SQLite database itself is made up of a series of pages that contain the records, and associated information, that are stored in the database. Notably pages in SQLite start from 1 rather than 0.

The WAL file consists of a header followed by a number of “frames”. Each frame contains a frame header and a page worth of data from the database with the updated contents. The same page can appear multiple times in the same WAL file; each time the page needs to be updated the latest version is found (either in WAL or database file) and then changes made and written to end of the WAL file.

The header of the WAL file has 2 Salt values as does each frame. For a frame to be considered valid, the 2 salts in the frame header must match the 2 in the WAL header.

Below is a very simple of sequence of update events with a diagram.

Basic WAL operations

The diagram above shows 3 update operations to the main SQLite database. The first is an update to page 1 and prior to this, the WAL log can be considered empty. As the page being updated does not already exist within the WAL file, a copy is made to the WAL file and the updates to the page are made in the copy of the page. The data within the SQLite database remains unchanged.

Next an update is made to page 4. Again this page does not exist already within the WAL file and so a copy of the page is made in the WAL file and updates made to the copy of the page.

Lastly, page 1 is updated again. As the page already exists within the WAL file, a copy is made of the earlier version of page 1 within the WAL file and this is moved to a new frame at the end of the WAL file. The new updates are then made to the later copy of the page 1 data.

At the end of this process, we have 3 versions of Page 1; One within the main database, one in the first frame of the WAL file and one within the 3rd frame of the WAL file. We also have 2 copies of page 4 in a similar manner. Each of the copies of the pages are different and the may be records that exist in the Page 1 in the first WAL frame that do not exist in the copy in the 3rd WAL frame.

There is a lot more to WAL files than described here, but this should be sufficient information for us to review the data.

Test data and setup

As I’ve created test data for this particular post, I’ll detail each of the actions taken, including the time, so that we can use it to compare against our results. All actions were taken on the 24/09/2020. Prior to the actions below, a new Windows 10 was set up and Firefox was installed. Once installed, all history was deleted from the database. Throughout the process, copies of the databases we created and these are labelled Point 1-4. When the copies were made is also listed below.

TimeAction
Initial copy of databases taken
0937Accessed BBC news website –
0938Accessed BBC news article
0938Moved to sports section of BBC News
Took a copy of Database and WAL files (Point 1)
0939Deleted access to BBC news article
Took a copy of Database and WAL files (Point 2)
0940Accessed URL of misspelt Dilbert website –
0940Visited Dilbert Website –
0940Accessed About section of Dilbert Website
Took a copy of Database and WAL files (Point 3)
Closed browser
Took a copy of Database (Point 4)
Table 1: Actions taken in Firefox browser/VM

As you can see, this is a very simple timeline of events; accessed a few webpages, deleted one and then accessed a couple more.

So in this instance, we know that the deletion was done at 0939. In this case we have multiple copies of the database as we go through the activity, in reality we would only have the one at most. This is mostly to illustrate how the logs are used but the technique is still possible with just the 1 set of files at Point 3.

Viewing the test data

The database files have been included in Virtual hard disk, so it can be mounted on any Windows Machine and set to read only. The Virtual hard disk can be attached using the Disk Management service, available from the Quick Link menu (Win key+X). It can also be accessed natively by a lot of forensic tools, including X-Ways Forensics.

On the disk, there are 4 folders, the initial SQLite database with no WAL file, Point 1-3 folders with a WAL/SHM file each and a Point 4 folder with no WAL file.

The templates that are shown at the top of this post can be used to quickly decode the hex data if using X-Ways/Winhex (with licence). Otherwise the data can be examined manually in other tools. All the offsets are shown throughout this post.

Before we start examining the files, all of the structures used in SQLite are Big-Endian, make sure that whatever tool you are using is set up to interpret Big-Endian data. I will be using X-Ways Forensics throughout and other tools may display slightly differently.

WAL Structure

So firstly we need to look at the structure of the WAL file (tables taken from SQLite 3 website):

WAL Header Format

OffsetSizeDescription
04Magic number. 0x377f0682 or 0x377f0683
44File format version. Currently 3007000.
84Database page size. Example: 1024
124Checkpoint sequence number
164Salt-1: random integer incremented with each checkpoint
204Salt-2: a different random number for each checkpoint
244Checksum-1: First part of a checksum on the first 24 bytes of header
284Checksum-2: Second part of the checksum on the first 24 bytes of header

Important information that we require from here:

  • Database page size – Then we know how much data is stored in frame
  • Salt 1 & 2 – We can use these to check for valid frames.

To be thorough, we should also use the Checksums to check the integrity of the frames, but as that requires computation, rather than simple matching, we will just use the salts.

Immediately after the WAL header (offset 32/0x20) we have the first frame which starts with a frame header:

WAL Frame Header Format

OffsetSizeDescription
04Page number
44For commit records, the size of the database file in pages after the commit. For all other records, zero.
84Salt-1 copied from the WAL header
124Salt-2 copied from the WAL header
164Checksum-1: Cumulative checksum up through and including this page
204Checksum-2: Second half of the cumulative checksum.

So we know that the size of the frame header to 24 bytes and we have the page size from the WAL header format, adding these together gives us the size of each Frame.

If we apply the template to the first 24 bytes of any of the WAL files (they should all be the same) we get:

This provides us with some useful information, including the SALT 1 & 2 values that will be required to check each frame is valid. It has also provided us with that page size, which is required to figure out the correct frame size, as 32,768 bytes.

Firefox History

With this information in hand, we need to apply it to our Firefox SQLite database and associated files. If you are not familiar with how Firefox stores history in the places.sqlite database, there are many good articles out there that cover it in far more detail than I will. However there are 2 tables in the database that we are going to focus on; moz_places and moz_historyvisits.

The moz_places table contains the URL and other information relating to a specific webpage. The moz_historyvisits contains information relating to when an entry in the moz_places table has been visited, including time/date and how the site was accessed. When the last reference in the moz_historyvisits is deleted for a particular entry in the moz_places table, the corresponding entry in moz_places should also be deleted. As there is only on reference to the deleted record, so we can either examine for changes to the moz_places table or the moz_historyvisits table

Examination of the WAL file

If we start on the first frame of the “Point 1 “WAL file (offset 32/0x20) and apply the template we get the following information:

This tells us that this frame relates to page 2 of the database. This doesn’t immediately help us, as we don’t know if that has any relation to the two tables we are interested in!. The SALT 1 and 2 values match those shown in the header, so we are happy that the frame itself is valid.

So how do we check for pages relating to a particular table? Well for this we need to view the hidden table in the places.sqlite database, the SQLITE_MASTER table. This table exists in all SQLite databases and can be accessed using the standard select query. So running the command “SELECT * FROM SQLITE_MASTER” on the database gives us the following results:

So we can see 1 entry for moz_places and one for moz_historyvisits. The most useful information is the ‘rootpage’ field (4th column) and this gives us the starting page for this database. In this case the root page for moz_places is 4 and moz_historyvisits is 12.

If the tables contained more records, then the data would be split into multiple pages and the data in the root page would show us where the other data was stored. In this case, with a relatively small amount of data, the entire data for that table is stored in the root page. If this were not the case, we would have to identify all the pages associated with the table and examine for all of them.

So one way to find the data for each page would be to look at the header of the first frame header (offset 32/0x20) check if it’s for a page we are interested in. If so review the data, if not jump to the next frame header by jumping (Frame Header Size + Page Size) bytes. This is quite time consuming and there can be a large number of pages per WAL, even in our small example.

The solution is to take advantage of some of the nice keyword search functionality in X-Ways Forensics. It has the option to run searches at set offsets rather than every byte using the “Cond: Offset mod”. So if we search for the hex “00 00 00 04”, Big Endian value 4, at Offset 32 every 32792 (header size + page size) bytes, we will find any frame header that relates to page 4. See below for the options as seen in X-Ways. We know the first header is at 32 bytes due to the WAL header taking up those first 32 bytes. Then we want to jump a full frame and check offset 32 (next frame header) for the value and so on and so forth. Please note that this setting will stay on next time you run a search and I accept no blame for you not finding search hits in your next job because you left this conditional offset on!

If we run this on the “Point 1” database, we would see the following offsets for hits:

If we run this on the “Point 2” database, we would see the following offsets for hits:

As we can see, there is an additional entry at 1082168 that exists in the “Point 2” database but not the “Point 1” database. Also notice that there are references to this page after this offset; will look at these later.

If we go to file offset 1016584 and look at the page data there, using the X-Ways template, we see:

So we have 12 live records in this page. Now if we do the same for the page data stored in the frame starting at 1082168 we get:

So here we can see that the cell pointer at page offset 31,052 has been deleted and the pointer to 30,631 has been moved to the slot that contained the offset. Notice the first deleted cell pointer is also 30,631 as the address is not cleared when it moves.

So going to back to the frame starting at file offset 1016584, if we jump 31,052 bytes from the start of the page header, we get the following hex data:

I’ve highlighted the 1st byte of the data. From here we could decode the SQLite record and view the data. I’m not going to cover how to do this manually in this blog post, it’s almost a post in its own right, but I am going to introduce an X-Tension that I have created to make this process easier.

The XT_SQLiteRecordDecoder X-Tension (link at top of this post) requires you to highlight the start of the SQLite record and it will then decode the rest and highlight the entire record. In needs to be used from the disk/partition view in X-Ways rather than the file view. So if you place your cursor on the first byte of the record and press the partition button on the top left of the hex window you should be taken to partition offset 34,2087,875. Highlight the “82 18” hex values by dragging across the data window. Should look like this:

Hex data with first 2 bytes of record highlighted

Now press the “Run X-Tensions” button on the menu at the top of the X-Ways interface:

This will give you the menu to run the X-Tension:

On your first time running the X-Tension, you will have to provide X-Ways with the location and this can be done by pressing the “+” button on the bottom middle of the dialog box. Once the X-Tension is highlighted, press OK and the following should be displayed in the output window:

[XT] Record ID:12	Record Length:280	Header Length:17
[XT] Record Data:"12","https://www.bbc.co.uk/news/av/business-54268400","Brexit: Why is it so hard to reach a deal? - BBC News","ku.oc.cbb.www.",1,0,0,"100","1600936685597000","KLmlJJvjjSPQ",0,"47356514151165","Time is running out to reach a Brexit deal - what are the obstacles?","https://ichef.bbci.co.uk/images/ic/400xn/p08sbh6q.jpg","5"

Given that we know the structure of the database, we can match each of these values back to a field. The whole record should also be highlighted in the hex window:

Hex data with entire SQLite record highlighted

The value “1600936685597000” is the last visit time and we use this as the time the record was entered into the database. This value is a Firefox timestamp, which is a Unix timestamp in milliseconds rather than seconds. Dividing it by 1,000,000 gives us a Unix timestamp and this one translates to 24/09/2020 at 08:38:05 UTC.

So we now have our time for the record being created in the database.

In order to locate our next timestamp that we can use, we need to look at the point 3 database. If you run the same search for WAL headers relating to Page 4, you should get the following list:

So the frame at file offset 1082168 was the last frame we examined in the Point 2 WAL file, so the next change to the page would be located at file offset 1574048.  You may have noticed that some of the offsets after this offset are different to the previous databases. This is due to previous WAL records, which have been committed to the database, being overwritten with new WAL records. I’ll cover this a little more at the end of this post.

Now if we examine the frame at offset 1574048 we can see that the number of cells is back up to 12 and that the last offset is different. The new data has taken some of the space previously occupied by our deleted record.

If we go to page offset 31,251 we can see the record for the (misspelt) www.dilbrt.com

Decoding this record using the XT_SQLiteRecordDecoder X-Tension gives us the following information:

[XT] Record ID:14	Record Length:82	Header Length:16
[XT] Record Data:"14","http://www.dilbrt.com/","<NULL>","moc.trblid.www.",1,1,1,"2000","1600936808068000","8dLkpvxIP9tC",0,"125510922083568","<NULL>","<NULL>","6",

1600936808068000 again is a Firefox timestamp that translates to 24/09/2020 at 08:40:08.

So now we know when the record was added to the database and when the next record was added. So logically we can deduce that the deletion has to have taken place between 08:38:05 and 08:40:08 on 24/09/2020 (UTC).

We could have proved this in a similar manner using the moz_historyvisits table and matching the visits record to the moz_places url. Potentially we could have narrowed this down further using the record relating to the BBC sports section, which is recorded as being last accessed at 08:30:30 UTC on the same day, using the exact same techniques.

I hope this has been helpful in showing a practical example of how time lining SQLite records using the WAL file is possible (if quite time consuming to do manually!).

P.s: Hits from later in the WAL file

So earlier we discussed that there were hits for the page number after the latest record that we were interested in. So to explain that, we shall have a look at the WAL frame header located at offset 1934760 on the “Point 2” WAL database and compare it the one from offset 1016584 of the same database.

See how the SALT 1 and 2 values don’t match, this is because the WAL frames are from before the last checkpoint operation on the database, where the WAL data is integrated back into the database. The SQLite3 File format specifies that the WAL salt-1 value is incremented by 1 and the salt-2 value is randomized, which is why the salt-1 value in the header at offset 1934760 is 1 larger that value in the header at offset 1016584.

When VM’s go wrong

Welcome to the first (hopefully of many) posts to the new blog. This post is based around an issue that was raised to me that they couldn’t virtualise an exhibit for a case they were working on.

I will mention a number of tools in this post and I want to add I don’t have any vested interest in any of them! These tools just so happen to be the ones available to me and/or my personal preferences.

There are a number of reasons that you might want to create a VM of a device, from wanting to check the settings of a particular program to extracting passwords from browsers such as Google Chrome. Although many of the things that you might want to display from a VM can be done by interpreting a settings file or sifting through registry keys, sometimes it is just so much nicer to have a visual capture of the settings as the user would have seen them.

There have been many posts on how to create a VM from a forensic image file and some very good software has been created to help in this task. This post, however, is dedicated to creating a VM of an exhibit that is NOT in a bootable state at the time of seizure.

So, the investigator has tried creating a VM from scratch using Virtualbox and also tried using Virtual Forensic Computing (VFC), which is usually pretty good at making VM’s when they don’t usually work. The machine would start up but no operating system would boot and there was a Windows 7 (x64) system installed. Time for a little bit of work!

The starting place for our investigation was to check the Master Boot Record. At a very quick glance, it appears to be intact, the boot code looks normal and there are partition entries. However, a closer inspection reveals that the partition is set to inactive.

MBR for disk with start of first partition entry highlighted

Normally, we would expect to see a value of 0x80 for the byte that is set to 0x00. It appears the rest of the entry is intact as well and if you use the X-Ways Template manager to view the data, it matches the location of the start of the NTFS partition.

This behavior is especially odd as the device was showing as being used very close to the time the device was seized, definitely within 12 hours of the last activity on the device. So our first job is to fix the MBR for the VM.

To do this, I use Arsenal Image Mounter from Arsenal Recon. I prefer this to FTK imager as it provides a true mounted disk (similar to Encase PDE but without having to dirty my system with Encase) and that gives some additional capabilities. But when mounting with this tool, we set it to keep a cache of changes made to the disk. Also note what physical drive the image file is mounted as, we will need that later.

Interestingly, as soon as the disk is mounted, it is showing as an uninitialized disk within the Disk Management window. So now we need to change the data on the disk using a hex editor. I’m a big X-Ways Forensics fan and, although X-Ways won’t allow you to change data, Winhex that ships with it allows you to easily change the values.

Run Winhex in administrator mode and open the physical disk. Once this is done, simply navigate to the first byte of the record (offset 446) and type 0x80 into the byte. It should show up as red and then press Ctrl+S to save the change. We are not actually altering evidence here, due to the write cache on the image file, so don’t worry.

Now we are going to try and make a VM out of this newly changed disk. However, given the errors that you can easily get with creating VM’s off virtual disks and using different software, we are going to create a VMDK of the disk using Virtualbox. For those who don’t know about frequent VM errors, the Physicaldrive can be shown as being used and therefore stops the virtualisation software using it. This usually requires a restart of the machine, which is pain as you usually always have something processing.

So to create the VMDK file from the mounted image file, we can use the following command:

cd C:\Program Files\OracleVirtualBox

C:Program FilesOracleVirtualBox>vboxmanage convertfromraw \\.PhysicalDriveX "C:\outputpath\outputfilename.vmdk" --format VMDK

This will usually take between half an hour and a couple of hours depending on your machine and the size of the disk. But once this has been done, we have a VMDK that no longer requires the disk to be mounted and has the primary entry in the MBR marked as active. You can also create a copy of the VMDK file in case we break the original!

Now to create a new VM to try and boot this system. I was using Virtualbox, but VMware player should work equally well, so the screenshots will be for this software. These should provide a rough guide for creating a VM in virtualbox and remeber to select the VMDK we created earlier for the hard drive.

Now, simply starting the VM still gives us an error to say “Operating system cannot be found”. This is clearly a larger issue than the MBR entry being broken. So, in order to try and fix this, we can attach a Windows Installation ISO to the VM. This will give us some repair functionality to try and get this system working. I used a Windows 8.1 ISO for this machine, however any version equal to or newer than the version you are trying to fix should be fine.

Once the VM started it gave us the option to boot from the installation disk by pressing any key. Once the ISO booted, we selected “Repair your computer > Troubleshoot > Advanced Options > Command Prompt”.  This opened the command prompt within the repair tools. 

Now there are several things that can cause the system not to boot correctly. The first is damaged MBR, but we had already fixed that, the next items are all specific to the Windows OS. We have a Boot Configuration Data (BCD) that contains information required to boot the system. This is used in Windows Vista and newer and replaces the boot.ini that was used by previous version. Now we are in a command prompt, we can try and fix this

X:\Sources>bootrec /fixmbr 
The operation completed successfully
X:\Sources>bootrec /fixboot 
The operation completed successfully
X:\Sources>bootrec /rebuildbcd 
Successfully scanned Windows installations.
Total identified Windows installations: 1
[1] C:\Windows
Add installation to boot list? Yes (Y)/No (N)/All(A): Y (select ‘Yes’ by responding with ‘Y’)
The operation completed successfully

The first command fixes the MBR, just in case there was an error we couldn’t see. That might include corrupt assembly code that isn’t easily visible. The second and third commands relate to the BCD and, as we can see from the output, it identified the Windows OS on the disk. After this, we restarted the virtual machine and came up with the same OS no found error.

So restart the machine and back into the Windows Installation disk. This time we navigate through Repair your computer > Troubleshoot > Advanced Options > Startup Repair. It displays the detected operating system(s) that can be repaired.

This then shows us the OS’s that can be seen and we select our Windows 7 operating system as the one that needs repairing. It then runs through its process and restarts the VM. This time, the VM gives us the option to start Windows normally or to start in safe mode. Selecting starting windows normally gives a blue screen.

Still this is a further step forward, and those who deal with VM’s regularly know there are a variety of methods for fixing this kind of error. However by this point I just wanted the thing to boot, so we went back to VFC. By mounting the VMDK that was attached to our Virtualbox machine, we could point VFC at the (now fixed) disk and generate the Virtual Machine. This then booted us straight through to the Windows Login screen. Hooray!

If this hadn’t worked, there were still a couple of options left for us, the most notable was using the Windows boot disc to restore the operating system to an earlier point in time. This would hopefully give us a point at which the system was still in working order.

As you can see, this is a very long winded way of creating a VM, but it does allow us to virtualise a machine that appears. So thanks for reading my first post and hopefully it has been useful, or at the very least, entertaining!

New Blog!

Hi and welcome to the first post of my new blog! I’m hoping this will be the first of many posts, although this will be the only one where nothing forensic/investigative will be posted.

I’ve worked in Digital Forensics/Investigations for a number of years and over this time I’ve created new tools and encountered some weird and unusual problems. With the tools, I’ve mostly distributed them to others that I know within the field, but feel like I should make a larger number of them available to the community as a whole.

This blog came around as in my office its become quite common to “ask clees” when there’s an issue that nobody else seems to have any idea of how to fix. Usually when this happens, I get the person involved to do a little write up of how we fixed it and send it round the office so that everyone can deal with the issue if they should get it (or a similar one).

After solving one problem, a colleague suggested that I should create a blog and post write-ups to forensic problems and that it would allow me to be even easier to point people to any previous work that had been done by pointing to this blog!

So hopefully it won’t be long before the first actual forensic post is up and running, until then enjoy the rest of the site!