
Dissecting the AD1 File Format

Foreword

This article covers my personal exploration and dissection of the proprietary AccessData image format known as the AccessData Logical Image. This format is also referred to as AD1, after its file extension, and such images are generated by the popular digital forensics tool FTK Imager. The research into this file format consists of observations about its overall data structures, based on experimentation against multiple samples collated solely for the purposes of this article. Therefore, the findings disclosed here should not be treated as an exhaustive or conclusive study of the AD1 file format, but rather as a foundation to build upon.

Introduction

I began my exploration into the AD1 file format through a digital forensic challenge of all things. Such challenges are not uncommon in the world of DFIR, and I have found them to be a very useful aid when teaching analysts how to interrogate disk images with forensic software. To this end, I began working through some of the forensic challenges on CyberDefenders, specifically the one named HireMe. This challenge provides an AD1 file which the player needs to analyse in order to answer a series of questions.

Now, as a Linux user, my first reaction was to ascertain whether there was a pre-existing way to extract an AD1 file without using the Windows operating system. Interestingly, I found a number of online forums discussing whether this was indeed possible; 1, 2, 3, 4. However, they all recommended using Windows-based tools to export the data. On Windows, the examiner has multiple options for extracting AD1 files, including:

  • Load the AD1 image into FTK Imager and manually export the files
  • Use the Forensic7z plugin for 7-Zip
  • Use Autopsy with a custom AD1 module
  • Use another Windows-based forensic tool (like Paladin) to mount and extract the AD1 data

Interestingly, even after extensive searching online, I could not find a reliable way to extract AD1 data from the Linux command-line. Should anyone reading this know of a CLI tool or method that I am not aware of which can perform these extractions, please let me know.

The AD1 File Format

AD1 files are an AccessData proprietary format described on their official blog as a “forensic image container”1, meaning that they are not very well documented online, which is to be expected. Establishing exactly what a ‘forensic image container’ means in this context was the next phase of my research. Traditional forensic image files, such as DD, AFF or E01 files, typically contain the entire file system structure, including partition data, slack space, unallocated data, full file metadata, etc. However, on the surface, it appeared that this was not necessarily the case with AD1 files.

Interestingly, according to an official FTK Imager user guide, there are two versions of the AD1 image format: the newer AD1v4 and the older AD1v3. This is significant because older versions of AccessData software cannot recognise the newer AD1v4 format, although it is possible to convert AD1v4 images into the older format using FTK Imager 3.4.0. Additionally, the documentation stipulates that any version of FTK Imager from 3.4.2 onwards will only generate AD1v4 format images2.

DISCLAIMER: This article will only be discussing the data structures associated with the newer AD1v4 format; please assume all AD1 files mentioned or used as samples from this point on are of the AD1v4 format.

The primary point to note at this stage is that the AD1 file format is not a traditional forensic image, but rather a container comprising the logical file data. There is no disk geometry or volume information contained within an AD1 file, meaning it cannot be read by tools such as the Sleuth Kit’s mmls command. To use Brian Carrier’s3 well-known file system abstraction model, AD1 files contain data that resides only at the ‘file system’ layer and above, with no concept of the data at lower levels, whereas traditional forensic images (such as the raw DD format) usually include data from every layer of this model.

Research & Experimentation

AD1 Sample Generation

For the next step in my experimentation process, I needed a good variety of AD1 files to conduct testing on, whose data content I controlled. To this end, I used an existing Windows 10 (Version 20H2) virtual machine in QEMU I had been using for various tests and installed FTK Imager version 4.2.1, which was the version I had to hand at that point in time. It is important to understand that within FTK Imager, there are two methods of generating an AD1 image:

  • Exporting data to AD1 Logical Image
  • Adding data to a Custom Content Image (AD1)

This can be confusing, as the above implies that there are two different types of AD1 files; however, they are functionally the same. The main difference is that Custom Content AD1 files can include multiple sources of data, whereas the standard AD1 Logical Image export is typically performed on a single data source, such as a root directory. In either option, the first screen you will be greeted with is an ‘Evidence Item Information’ box, which prompts for information relating to the case and the examiner.

The data you input into this box is largely irrelevant to the resulting AD1 file, as it is stored in an accompanying .txt file. These text files are automatically generated by FTK Imager every time you create an AD1 file, and they contain the information you provided in the aforementioned box, along with:

  • The version of FTK Imager used to create the AD1 file
  • The AD1 file version
  • The AD1 data source(s)
  • MD5 and SHA-1 Hash values
  • Timestamps for the start and end of the acquisition process
  • The file path the AD1 was saved to

After selecting the destination to which the AD1 file will be written, along with its name, there are three additional options available to the examiner: Image Fragment Size, Compression and AD Encryption. Image fragment size is simply a mechanism to allow examiners to split very large acquisitions into more manageable chunks. The default fragment size is 1500 MB, and this value is stored in the AD1 header, as we will see later on.

DISCLAIMER: For the experimentation process, I generated multiple AD1 files using various fragmentation sizes. However, split AD1 files are considered to be outside the scope of this article and the research conducted from here on assumes that the fragmentation size is large enough to cover the entire acquisition.

To ensure that I had ample control samples, in addition to the random samples (AD1 files taken from online challenges/other resources), I created two separate folders and gave each of them a unique structure using random files I had on the Windows machine:

Folder 1/
├── Random.sl2
├── TestD1
│   └── JPG_File.jpg
├── TestD2
│   ├── TestD3
│   │   └── Web_File.html
│   └── Text_File.txt
└── TestD4
    └── PNG_File.png

Folder 2/
├── _ctypes_test.pyd
├── TestD1
│   ├── One_Note_File.one
│   ├── TestD3
│   │   └── DLL_File.dll
│   └── TestD4
│       └── Config_File.cfg
└── TestD2
    ├── Log_File.log
    └── TestD5
        └── Python_File.py

Then I used FTK Imager to create five distinct sets of AD1 data:

NOTE: The names in parentheses below denote the AD1 file names used throughout this article.

  1. AD1 Logical Image of Folder 1 (Logical_F1)
  2. AD1 Logical Image of Folder 2 (Logical_F2)
  3. AD1 Custom Content Image of Folder 1 (F1)
  4. AD1 Custom Content Image of Folder 2 (F2)
  5. AD1 Custom Content Image of Folder 1 and Folder 2 (Combined)

For each set of data, I repeated the acquisition at every compression level, resulting in ten AD1 files per dataset. I also created a handful of AD1 files across the datasets at non-default fragmentation sizes, but never a size small enough to cause the resulting AD1 file to be split.

AD1 Compression and Encryption

For the compression level, the examiner has 10 options, ranging from level 0 (no compression) to level 9 (the highest level of compression). According to the aforementioned user guide, the desired compression level will depend on two factors: time and file size. Should an examiner want an acquisition to complete relatively quickly, they would select a lower level of compression, but will need to deal with a larger file size. Conversely, if an examiner needs to prioritise file space over time taken, they would be better off using a higher level of compression.

The final option available to an examiner upon creation of an AD1 file is to enable AD encryption. According to the user guide, AD encryption can be implemented using either a password, or through public-key cryptography. An AD1 file with encryption applied can be identified by its unique file signature as follows:

# xxd Combined_ENC_C0_1500F.ad1 | head -1
00000000: 4144 4352 5950 5400 0100 0000 0002 0000  ADCRYPT.........

From this we can derive that, should an examiner find themselves in possession of an AD1 file with the signature 0x41444352595054, the AD1 file has been encrypted by FTK Imager. It is also worth noting that, in my samples, the actual encrypted data starts exactly 512 bytes into the AD1 file, at offset 0x200. This article will not cover any further investigation into AD encryption, as that extends beyond the scope of my research.
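This signature check is trivial to automate. Below is a minimal Python sketch (the function name is my own) that flags a file as AD-encrypted based on the magic bytes observed above; treat the value as an observation from a handful of samples, not an official specification:

```python
# Sketch: detect FTK Imager's AD encryption from the leading file signature.
# The magic value is taken from the sample hex dump above (an observation,
# not documented behaviour).

ADCRYPT_MAGIC = bytes.fromhex("41444352595054")  # b"ADCRYPT"

def is_ad_encrypted(header: bytes) -> bool:
    """Return True if the buffer starts with the ADCRYPT signature."""
    return header.startswith(ADCRYPT_MAGIC)
```

Per the samples, the ciphertext itself would then begin at offset 0x200.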

AD1 Header Section(s)

Now that I had a decent number of AD1 control samples to hand, I could move on to the dissection process. I would like to reiterate at this point that there is no extensive online documentation for this file format, as it is proprietary to AccessData. Therefore, I will be making a lot of assumptions about the data I am working with based on the samples I have. Additionally, I will very likely find misleading patterns or misinterpret data, but remember: this is all a normal part of the research process.

As with any forensic examination, I began my analysis by reviewing the AD1 data at the hexadecimal level, using a combination of the xxd and dd Linux tools. Immediately, the first thing I noted was that each AD1 file appeared to contain a 512-byte header section, quite similar to the boot sector of a disk image. Thus I decided to refer to this initial 512 bytes of data as the first ‘AD1 Header’ section. Within this header, the first piece of data is obviously the AD1 signature, consisting of 15 bytes:

# dd if=F1_C7_1500F.ad1 bs=1 count=15 status=none | xxd
00000000: 4144 5345 474d 454e 5445 4446 494c 45    ADSEGMENTEDFILE

As with any file signature, this part of the data identifies the file as a legitimate .ad1 file to programs designed to interpret it. Subsequent testing across my samples showed a 5-byte value at offset 0x22 within the header, which appears to correlate to the fragment size set by the examiner:

# dd if=F2_C0_1500F.ad1 bs=1 count=5 skip=34 status=none | xxd -ps -e | awk '{print $2}'
00005dc0
# dd if=F2_C7_1500F.ad1 bs=1 count=5 skip=34 status=none | xxd -ps -e | awk '{print $2}'
00005dc0
# dd if=F1_C0_2000F.ad1 bs=1 count=5 skip=34 status=none | xxd -ps -e | awk '{print $2}'
00007d00
# dd if=F2_C0_2000F.ad1 bs=1 count=5 skip=34 status=none | xxd -ps -e | awk '{print $2}'
00007d00
# dd if=F2_C9_3000F.ad1 bs=1 count=5 skip=34 status=none | xxd -ps -e | awk '{print $2}'
0000bb80
# dd if=F2_C9_4000F.ad1 bs=1 count=5 skip=34 status=none | xxd -ps -e | awk '{print $2}'
0000fa00

As you can see, the values of these bytes change depending on the fragment size, which was recorded in the file name; for example, 2000F corresponds to a fragment size of 2000 MB. To interpret these hexadecimal values, they must first be read in little-endian format, converted into their decimal equivalent, and then divided by 16 to obtain the fragment size. The following table provides a breakdown of the values I used:

HEX (LITTLE-ENDIAN)    DECIMAL    FRAGMENTATION SIZE
00002b10               11024      689 MB
00005dc0               24000      1500 MB
00007d00               32000      2000 MB
0000bb80               48000      3000 MB
0000fa00               64000      4000 MB

Therefore, we can say that these particular bytes in the AD1 header will read c0 5d (the little-endian encoding of 0x5dc0) when the fragment size is left at the default of 1500 MB. It is also worth mentioning that FTK Imager will not let you create an AD1 file without specifying a fragment size, so these bytes should always be populated. Through further testing, I also discovered that the maximum value these 5 bytes can hold is 0x07fffffff0, which equates to a fragment size of 2147483647 MB. Any attempt to specify a higher size will simply default back to this maximum value.
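The decoding steps above (read 5 bytes at 0x22, interpret little-endian, divide by 16) can be sketched in Python. The helper name is mine, and the field semantics are inferred purely from the samples:

```python
def fragment_size_mb(header: bytes) -> int:
    """Decode the fragment-size field observed at offset 0x22.

    Observation from the samples: the 5 bytes there, read as a
    little-endian integer and divided by 16, give the size in MB.
    """
    raw = header[0x22:0x22 + 5]
    return int.from_bytes(raw, "little") // 16

# Example: the default 1500 MB fragment size is stored as c0 5d 00 00 00.
default_field = bytes.fromhex("c05d000000")
header = b"ADSEGMENTEDFILE" + b"\x00" * 19 + default_field + b"\x00" * 473
```

With this synthetic header, `fragment_size_mb(header)` yields 1500.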

Proceeding past the 512-byte header of the AD1 file, the next part of the file is essentially another signature, consisting of 14 bytes at offset 0x200, which looks like the following:

# dd if=Combined_C3_1500F.ad1 bs=1 skip=512 count=14 status=none | xxd
00000000: 4144 4c4f 4749 4341 4c49 4d41 4745       ADLOGICALIMAGE

This secondary signature is also consistent across all the samples I tested. From here, I decided to work in 16-byte chunks in an attempt to find patterns in the data and I found that the next piece of tangible data starts at offset 0x210 and typically looks like the following:

# dd if=F1_C7_1500F.ad1 bs=1 skip=528 count=16 status=none | xxd -ps
04000000010000000000010000000000

Unfortunately, I have no idea what this data refers to; the only variation in this value appears in the AD1 samples generated as logical images:

# dd if=Logical_F2_C0_1500F.ad1 bs=1 skip=528 count=16 status=none | xxd -ps
04000000010000000000010070000000

From this, the only conclusion I can tentatively draw is that if the single byte at offset 0x21c is set to 0x70, the AD1 file is a logical image container, whereas a 0x00 value at this offset indicates a custom content container:

# dd if=F2_C5_1500F.ad1 bs=1 skip=540 count=1 status=none | xxd
00000000: 00                                       .
# dd if=Logical_F2_C5_1500F.ad1 bs=1 skip=540 count=1 status=none | xxd
00000000: 70                                       p
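If the byte at 0x21c really does discriminate the two container types, a check might look like the following sketch. Hedged: only the two values shown were observed in my samples, so anything else is unaccounted for:

```python
def container_type(data: bytes) -> str:
    """Guess the AD1 container type from the byte at offset 0x21c.

    Based solely on the samples examined: 0x70 appeared in logical image
    containers, 0x00 in custom content containers.
    """
    flag = data[0x21C]
    if flag == 0x70:
        return "logical image"
    if flag == 0x00:
        return "custom content"
    return "unknown (0x{:02x})".format(flag)
```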

Again, the next 16 bytes of data, starting at offset 0x220, suffer from the same problem as the previous data:

# dd if=F1_C5_1500F.ad1 bs=1 skip=544 count=16 status=none | xxd -ps
0000000079000000000000001d000000
# dd if=Logical_F2_C4_1500F.ad1 bs=1 skip=544 count=16 status=none | xxd -ps
000000008f0000000000000014000000

From the samples I have, I can only conclude at this point that this data is yet another identifier of some kind distinguishing the type of AD1 file created: Custom Content or Logical Image. The next 16 bytes of data, starting at offset 0x230, are a bit more interesting, however:

# dd if=F1_C7_1500F.ad1 bs=1 skip=560 count=16 status=none | xxd -ps
414400005c000000000000001a7f4600
# dd if=F1_C7_1500F.ad1 bs=1 skip=560 count=16 status=none | xxd
00000000: 4144 0000 5c00 0000 0000 0000 1a7f 4600  AD..\.........F.

At this point, I quickly threw together a Bash script to iterate through each of my samples. Running this script, I found that the first 5 bytes of the above data chunk were identical in every sample, which makes sense, as AD suggests a signature, likely meaning AccessData. However, the last few bytes in this row are unique to every AD1 file. For example, here are the unique hexadecimal values from each of my combined-folder custom content AD1 files:
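My original script was written in Bash; a rough Python equivalent of the same idea (walk a directory of samples and hex-dump the 16 bytes at 0x230 from each) would look something like this. The `.ad1` glob and current-directory layout are assumptions about where your samples live:

```python
from pathlib import Path

def chunk_hex(path: Path, offset: int = 0x230, length: int = 16) -> str:
    """Return the hex of a small chunk, mirroring the dd | xxd -ps one-liners."""
    with path.open("rb") as fh:
        fh.seek(offset)
        return fh.read(length).hex()

if __name__ == "__main__":
    # Print the 0x230 chunk for every sample in the working directory.
    for sample in sorted(Path(".").glob("*.ad1")):
        print(f"{sample.name}: {chunk_hex(sample)}")
```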

Combined_C0_1500F.ad1: 414400005c0000000000000057dd4900
Combined_C1_1500F.ad1: 414400005c0000000000000091404700
Combined_C2_1500F.ad1: 414400005c00000000000000f12b4700
Combined_C3_1500F.ad1: 414400005c00000000000000191e4700
Combined_C4_1500F.ad1: 414400005c00000000000000fb144700
Combined_C5_1500F.ad1: 414400005c00000000000000bd074700
Combined_C6_1500F.ad1: 414400005c000000000000000a054700
Combined_C7_1500F.ad1: 414400005c00000000000000a9044700
Combined_C8_1500F.ad1: 414400005c0000000000000065044700
Combined_C9_1500F.ad1: 414400005c000000000000005d044700

As you can see, the last 4 bytes of hexadecimal data above are unique across the AD1 files; however, you may notice that for files C1-C9, the last 2 bytes are the same. This implies that the value refers to a common, yet slightly different, piece of information between the AD1 files. Hence my first two thoughts were that this value was either a timestamp or the file size, since each file should be smaller than the last. I tested this mini-hypothesis by reading the value from the AD1 file in little-endian and converting it into decimal.

# dd if=F2_C4_1500F.ad1 bs=1 skip=572 count=4 status=none | xxd -ps -e | awk '{print $2}'
000088d8
# dd if=Logical_F1_C4_1500F.ad1 bs=1 skip=572 count=4 status=none | xxd -ps -e | awk '{print $2}'
00468c32

# hexconv -d 000088d8
DECIMAL:	35032
# hexconv -d 00468c32
DECIMAL:	4623410

# du -b F2_C4_1500F.ad1
35936	F2_C4_1500F.ad1
# du -b Logical_F1_C4_1500F.ad1
4624294	Logical_F1_C4_1500F.ad1

As you can see from comparing the output of the disk usage (du) command above, these decimal values correlate very closely with the size of the AD1 file in bytes. They do not match exactly in any sample I tested, so I will assume that this value refers to the size of the compressed data contained within the AD1 file as a whole. It is likely that these values exclude the header and footer sections of the AD1 file, hence the actual AD1 size on disk being slightly larger than the size reported within the AD1 file itself.
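Reading this size field back (4 bytes at offset 0x23c, little-endian, per the `skip=572` commands above) is a one-liner. Again, the interpretation is my inference from the samples, not documented behaviour:

```python
def reported_payload_size(data: bytes) -> int:
    """Read the 4-byte little-endian value at offset 0x23c.

    In the samples this tracked the file's on-disk size closely but not
    exactly, so it presumably measures the payload, excluding some
    header/footer data.
    """
    return int.from_bytes(data[0x23C:0x23C + 4], "little")
```

For instance, the bytes d8 88 00 00 at that offset decode to 35032, matching the F2_C4 sample above.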

The final part of this data section before we reach the file structure appears to change depending on the type of AD1 file produced by the examiner: Logical Image or Custom Content. To conceptualise this, compare the two truncated hexadecimal outputs below, one for each type of AD1 file created for ‘Folder 1’:

# dd if=F1_C3_1500F.ad1 bs=1 count=1000 status=none | xxd
[ . . . ]
00000200: 4144 4c4f 4749 4341 4c49 4d41 4745 0000  ADLOGICALIMAGE..
00000210: 0400 0000 0100 0000 0000 0100 0000 0000  ................
00000220: 0000 0000 7900 0000 0000 0000 1d00 0000  ....y...........
00000230: 4144 0000 5c00 0000 0000 0000 7c90 4600  AD..\.......|.F.
00000240: 0000 0000 0000 0000 0000 0000 b891 4600  ..............F.
00000250: 0000 0000 0000 0000 0000 0000 4375 7374  ............Cust
00000260: 6f6d 2043 6f6e 7465 6e74 2049 6d61 6765  om Content Image
00000270: 285b 4d75 6c74 695d 2900 0000 0000 0000  ([Multi]).......
00000280: 00f9 0000 0000 0000 00c5 0000 0000 0000  ................
00000290: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000002a0: 0005 0000 0014 0000 0046 6f6c 6465 7220  .........Folder 
000002b0: 313a 443a 5c46 6f6c 6465 7220 3100 0000  1:D:\Folder 1
[ . . . ]

# dd if=Logical_F1_C3_1500F.ad1 bs=1 count=1000 status=none | xxd
[ . . . ]
00000200: 4144 4c4f 4749 4341 4c49 4d41 4745 0000  ADLOGICALIMAGE..
00000210: 0400 0000 0100 0000 0000 0100 7000 0000  ............p...
00000220: 0000 0000 8f00 0000 0000 0000 1400 0000  ................
00000230: 4144 0000 5c00 0000 0000 0000 1290 4600  AD..\.........F.
00000240: 0000 0000 0000 0000 0000 0000 4e91 4600  ............N.F.
00000250: 0000 0000 0000 0000 0000 0000 466f 6c64  ............Fold
00000260: 6572 2031 5c44 3a5c 466f 6c64 6572 2031  er 1\D:\Folder 1
[ . . . ]

Starting with the first AD1 file shown in the output above: this data contains another signature starting at offset 0x25c, with a length of 30 bytes. Drilling down on this signature more closely, we can extract a string to compare against the other Custom Content AD1 samples:

1
2
# dd if=F1_C3_1500F.ad1 bs=1 skip=604 count=30 status=none | xxd -ps
437573746f6d20436f6e74656e7420496d616765285b4d756c74695d297d

This hexadecimal value was consistent across all of the custom content samples I have. In the logical image samples, however, this data is omitted entirely. From what I can gather through some testing with xxd, the logical image containers omit roughly 80 bytes of data where the Custom Content signature would be (due to padding). Interestingly, the data then continues as normal with what appears to be the root folder name, seen above starting with ‘Folder 1’.

From this observation, we can see that in the Custom Content samples the actual file data starts at offset 0x2a9, whereas in the logical image samples the data starts at offset 0x25c. From these points we see the root folder and the start of a directory data structure, which consists of the compressed logical folders and files comprising our AD1 file.

AD1 Data Structure(s)

At this point, we have reached the most important part of the AD1 file: the primary data structure, which contains the file names, their raw compressed data, and their associated metadata. Through some preliminary experimentation with xxd and dd, I discovered that the files and folders are stored almost sequentially within the AD1 file. At first glance, the structure simply descends down the list of files and folders as presented in the tree, but without the branches. For instance, in the ‘Folder 1’ AD1 samples, the arrangement of the files and directories appeared to be as follows:

Folder 1/
Random.sl2
TestD1
JPG_File.jpg
TestD2
TestD3
Web_File.html
Text_File.txt
TestD4
PNG_File.png

Further investigation into the AD1 samples I had revealed that the data structure appears to follow a relatively simple sequence for each file and directory present within the given AD1 file. The table below provides a crude representation of the structure sequence, showing the data type and whether said type is present for file and/or folder data.

SEQUENCE  DATA                FILE  DIRECTORY
1         Name                Yes   Yes
2         ? Structure Data ?  Yes   Yes
3         Compressed Data     Yes   No
4         Original Size       Yes   No
5         MAC Timestamps      Yes   Yes
6         ? Attribute Data ?  Yes   Yes
7         Hash Values         Yes   No
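To make the sequence concrete, the table can be modelled as a small Python record. This is purely illustrative: every field name here is mine, and the optional fields reflect the Yes/No columns above:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Ad1Entry:
    """Speculative model of one file/directory record in an AD1 container."""
    name: str                             # 1: present for files and directories
    structure_data: bytes                 # 2: purpose unknown
    compressed_data: Optional[bytes]      # 3: files only
    original_size: Optional[int]          # 4: files only
    mac_timestamps: Tuple[str, str, str]  # 5: raw timestamp strings
    attribute_flags: bytes                # 6: 'true'/'false' strings, meaning unclear
    hashes: Optional[Tuple[str, str]]     # 7: (MD5, SHA-1), files only
```

A directory entry would simply leave `compressed_data`, `original_size` and `hashes` as `None`.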

In AD1 files, directories appear to be treated more like a plain file than an index, as they have no compressed data or hash values. For instance, if we look at the hexadecimal data of the TestD1 folder contained within the ‘Folder 2’ structure (F2_C0_1500F.ad1), we can see this more clearly:

00000320: 0005 0000 0006 0000 0054 6573 7444 3179  .........TestD1y
00000330: 0000 0000 0000 004c 0100 0000 0000 0002  .......L........
00000340: 0000 0002 0000 0001 0000 0033 6101 0000  ...........3a...
00000350: 0000 0000 0300 0000 0300 0000 0100 0000  ................
00000360: 308b 0100 0000 0000 0005 0000 0007 0000  0...............
00000370: 0016 0000 0032 3032 3130 3630 3254 3138  .....20210602T18
00000380: 3338 3031 2e38 3834 3731 34b5 0100 0000  3801.884714.....
00000390: 0000 0005 0000 0008 0000 0016 0000 0032  ...............2
000003a0: 3032 3130 3630 3254 3138 3134 3236 2e33  0210602T181426.3
000003b0: 3030 3030 37df 0100 0000 0000 0005 0000  00007...........
000003c0: 0009 0000 0016 0000 0032 3032 3130 3630  .........2021060
000003d0: 3254 3138 3138 3239 2e34 3430 3533 38f8  2T181829.440538.
000003e0: 0100 0000 0000 0004 0000 000d 0000 0005  ................
000003f0: 0000 0066 616c 7365 1102 0000 0000 0000  ...false........
00000400: 0400 0000 0e00 0000 0500 0000 6661 6c73  ............fals
00000410: 6529 0200 0000 0000 0004 0000 001e 0000  e)..............
00000420: 0004 0000 0074 7275 6542 0200 0000 0000  .....trueB......
00000430: 0004 0000 0002 1000 0005 0000 0066 616c  .............fal
00000440: 7365 5b02 0000 0000 0000 0400 0000 0310  se[.............
00000450: 0000 0500 0000 6661 6c73 6574 0200 0000  ......falset....
00000460: 0000 0004 0000 0004 1000 0005 0000 0066  ...............f
00000470: 616c 7365 0000 0000 0000 0000 0400 0000  alse............
00000480: 0510 0000 0500 0000 6661 6c73 6529 1600  ........false)..
00000490: 0000 0000 0000 0000 0000 0000 0061 1400  .............a..
000004a0: 0000 0000 00d6 0200 0000 0000 0068 1100  .............h..
000004b0: 0000 0000 0000 0000 0011 0000            ............

Breaking this output down against the table outlined earlier, we can see that the directory name appears at offset 0x329. Interestingly, 4 bytes prior to this offset we have the value 0x06, which corresponds to the length (character count) of the file or directory name. As 0x06 is simply 6 in decimal, we can ascertain that the directory name is 6 characters long. The presence of this name-length field was consistent across all files and folders contained within the AD1 samples I tested.
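A sketch of reading such a length-prefixed name follows. The assumption that the count occupies a 4-byte little-endian field (rather than a single byte) is mine, based on the null bytes visible around the 0x06 in the dump:

```python
def read_entry_name(buf: bytes, length_offset: int) -> str:
    """Read a name preceded by its character count.

    Assumption: the count is a 4-byte little-endian integer immediately
    before the name bytes, as it appeared in the TestD1 record above.
    """
    count = int.from_bytes(buf[length_offset:length_offset + 4], "little")
    start = length_offset + 4
    return buf[start:start + count].decode("ascii")
```

For example, `read_entry_name(b"\x06\x00\x00\x00TestD1...", 0)` yields `"TestD1"`.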

After the file name, we have some data interspersed with null bytes. I am not sure exactly what this data refers to, as it varied quite drastically between the samples I had. However, there must be a mechanism for AD1 extraction software to understand the structure, such as which file belongs to which directory. I therefore speculate that this data relates to the overall file structure, perhaps as a B-tree, judging by the literature I have read4.

The next pieces of data should be familiar to anyone in the forensics field: what appear to be our MAC (Modify, Access, Change) timestamps. Interestingly, it took a while to ascertain the order in which these timestamps are stored in the AD1 file, and whether it is consistent across all the samples. After some timestamp comparisons with the original files, taking special care not to accidentally update any of them, I eventually concluded that the AD1 format stores the three timestamps per file or folder in the following order:

  1. (A) Access Time
  2. (C) Change Time
  3. (M) Modified Time
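The raw timestamp strings visible in the hex dumps (e.g. 20210602T183801.884714) follow an ISO 8601 basic layout with fractional seconds, so they parse cleanly with Python's strptime; the helper name is mine:

```python
from datetime import datetime

def parse_ad1_timestamp(raw: str) -> datetime:
    """Parse a timestamp string as it appears in the AD1 hex dumps.

    The samples observed carry six fractional digits, which matches
    what strptime's %f directive accepts.
    """
    return datetime.strptime(raw, "%Y%m%dT%H%M%S.%f")
```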

Following on from the timestamps, we come to another set of odd-looking data. At first glance, these appear to be a series of flags, hence the ‘true’ and ‘false’ strings. I therefore speculate that they refer to individual file and directory attributes, such as whether a file is hidden, read-only, or protected. However, as with the structure data, this is a logical assumption on my part, as these flags also appear to be inconsistent between files and folders, despite none of the original files having any special attributes set.

Now we can compare the directory data to that of a standard file, for which I am going to use DLL_File.dll, a Portable Executable (PE) present within ‘Folder 2’ (F2_C3_1500F.ad1):

00000960: 0c00 0000 444c 4c5f 4669 6c65 2e64 6c6c  ....DLL_File.dll
00000970: a005 0000 0000 0000 0100 0000 0000 0000  ................
00000980: 9007 0000 0000 0000 2a39 0000 0000 0000  ........*9......
00000990: 785e ed7d 7b7c 1cc5 917f cfec ecec 6af5  x^.}{|........j.
000009a0: 5c59 b217 b0cd fa05 422f 4b96 2ccb c636  \Y......B/K.,..6
000009b0: 9625 d916 48b6 6249 e68d 3d5a 8dac 8d57  .%..H.bI..=Z...W
000009c0: bbf2 eeca b630 0619 0847 2e98 c005 4820  .....0...G....H 
000009d0: e115 c205 0839 9200 811c 3fc0 3c43 8e24  .....9....?.<C.$
000009e0: 478e f009 b910 4c8e 70e1 b824 845f c811  G.....L.p..$._..
000009f0: f2e3 e0f7 adea 9edd 1959 364e eefe bb93  .........Y6N....
00000a00: 999a aeea aaea eaea eeea c7cc 0e3d e75c  .............=.\
00000a10: 237c 4208 03d7 471f 09f1 30ee f4b7 46dd  #|B...G...0...F.
00000a20: 8f76 9b44 66c9 897f 5f22 1e28 f8c1 bc87  .v.Df..._".(....
[ . . . ]
00003aa0: b4ae b975 7868 6983 d588 05b1 c727 b2b3  ...uxhi......'..
00003ab0: 2e9e d21e 2b17 7b1a 4e36 bd8b e7e3 1b9e  ....+.{.N6......
00003ac0: 07a1 d371 57ee 96ed 89b3 7c7a 1f30 936b  ...qW.....|z.0.k
00003ad0: d2a1 b178 db6e bcee e8e8 1bc9 66c7 562c  ...x.n......f.V,
00003ae0: 5eac c675 7d6e 5cd7 c39c c57d 3d5d 8b97  ^..u}n\....}=]..
00003af0: e09d e2c5 aa77 38da e6af 461f b557 2e76  .....w8...F..W.v
00003b00: 9439 fd78 9a42 a754 14a8 8a35 abf9 83e4  .9.x.B.T...5....
00003b10: fffb f797 7b60 527e abfb 03fe aad8 fffe  ....{`R~........
00003b20: fd4f f3c0 ff07 c697 00f2 3f39 0000 0000  .O........?9....
00003b30: 0000 0200 0000 0200 0000 0100 0000 3158  ..............1X
00003b40: 3900 0000 0000 0003 0000 0003 0000 0005  9...............
00003b50: 0000 0032 3931 3834 8239 0000 0000 0000  ...29184.9......
00003b60: 0500 0000 0700 0000 1600 0000 3230 3231  ............2021
00003b70: 3036 3032 5431 3833 3834 372e 3337 3931  0602T183847.3791
00003b80: 3737 ac39 0000 0000 0000 0500 0000 0800  77.9............
00003b90: 0000 1600 0000 3230 3231 3036 3032 5431  ......20210602T1
00003ba0: 3831 3931 302e 3031 3232 3631 d639 0000  81910.012261.9..
00003bb0: 0000 0000 0500 0000 0900 0000 1600 0000  ................
00003bc0: 3230 3139 3033 3330 5431 3831 3933 342e  20190330T181934.
00003bd0: 3439 3636 3231 ef39 0000 0000 0000 0400  496621.9........
00003be0: 0000 0d00 0000 0500 0000 6661 6c73 6508  ..........false.
00003bf0: 3a00 0000 0000 0004 0000 000e 0000 0005  :...............
00003c00: 0000 0066 616c 7365 203a 0000 0000 0000  ...false :......
00003c10: 0400 0000 1e00 0000 0400 0000 7472 7565  ............true
00003c20: 393a 0000 0000 0000 0400 0000 0210 0000  9:..............
00003c30: 0500 0000 6661 6c73 6552 3a00 0000 0000  ....falseR:.....
00003c40: 0004 0000 0003 1000 0005 0000 0066 616c  .............fal
00003c50: 7365 6b3a 0000 0000 0000 0400 0000 0410  sek:............
00003c60: 0000 0500 0000 6661 6c73 6583 3a00 0000  ......false.:...
00003c70: 0000 0004 0000 0005 1000 0004 0000 0074  ...............t
00003c80: 7275 65b7 3a00 0000 0000 0001 0000 0001  rue.:...........
00003c90: 5000 0020 0000 0030 3861 6235 6537 3831  P.. ...08ab5e781
00003ca0: 6134 3863 3366 3632 3734 6561 6264 3239  a48c3f6274eabd29
00003cb0: 3663 6433 6234 6600 0000 0000 0000 0001  6cd3b4f.........
00003cc0: 0000 0002 5000 0028 0000 0064 3339 3337  ....P..(...d3937
00003cd0: 6565 6363 3361 6261 3830 6137 3537 3166  eecc3aba80a7571f
00003ce0: 3435 6333 3730 6437 6238 3163 6462 6335  45c370d7b81cdbc5
00003cf0: 3237 6100 0000 0000 0000 0087 3c00 0000  27a.........<...
00003d00: 0000 0031 3b00 0000 0000 0000 0000 0000  ...1;...........

As demonstrated in the directory data previously dissected, we can immediately establish the length of the file name: 0x0c (12) characters. Then we see that the name of the file is indeed DLL_File.dll, followed by some speculated structure data. Next, however, we see what appears to be almost random data, which to the untrained eye might look like encrypted or encoded data. This is, in fact, compressed data comprising the raw contents of the PE file.

Identifying the compression algorithm is actually quite simple once you read the signature: 0x785e indicates zlib compression. To this end, I strongly advise you to read RFC-1950, which provides the specification for the zlib compressed data format. It is also very important to note that this zlib signature changes depending on the level of compression specified. The table below matches the zlib signatures to the compression levels:

COMPRESSION LEVEL  ZLIB SIGNATURE
1                  78 01
2                  78 5E
3                  78 5E
4                  78 5E
5                  78 5E
6                  78 9C
7                  78 DA
8                  78 DA
9                  78 DA

Indeed, this is very likely how FTK Imager maps the examiner’s chosen level onto zlib when a value between 0 and 9 is specified. In the AD1 samples whose compression levels were set between 1 and 9, the above signatures matched perfectly:
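That table can be reproduced directly with Python's zlib module, whose two-byte stream header encodes the compression settings (see RFC-1950). The payload string below is arbitrary:

```python
import zlib

# Compress the same payload at each level and record the 2-byte header.
payload = b"The quick brown fox jumps over the lazy dog" * 20
signatures = {level: zlib.compress(payload, level)[:2] for level in range(1, 10)}

assert signatures[1] == b"\x78\x01"
assert all(signatures[n] == b"\x78\x5e" for n in (2, 3, 4, 5))
assert signatures[6] == b"\x78\x9c"   # zlib's default level
assert all(signatures[n] == b"\x78\xda" for n in (7, 8, 9))

# Round-trip: any of these streams inflates back to the original bytes.
assert zlib.decompress(zlib.compress(payload, 9)) == payload
```

The header's second byte carries the FLEVEL bits plus a check value, which is why several levels share one signature.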

# xxd F1_C1_1500F.ad1 | less
004212c0: 4a50 475f 4669 6c65 2e6a 7067 fc0e 4200  JPG_File.jpg..B.
004212d0: 0000 0000 0600 0000 0000 0000 1411 4200  ..............B.
004212e0: 0000 0000 bfd4 4200 0000 0000 9c9a 4300  ......B.......C.
004212f0: 0000 0000 d360 4400 0000 0000 8025 4500  .....`D......%E.
00421300: 0000 0000 26e7 4500 0000 0000 d961 4600  ....&.E......aF.
00421310: 0000 0000 7801 9cbc 7b40 13d7 d6ff 3d31  ....x...{@....=1

# xxd F1_C5_1500F.ad1 | less
00421290: a405 0000 0000 0000 0000 000c 0000 004a  ...............J
004212a0: 5047 5f46 696c 652e 6a70 67db 0e42 0000  PG_File.jpg..B..
004212b0: 0000 0006 0000 0000 0000 00f3 1042 0000  .............B..
004212c0: 0000 0082 ce42 0000 0000 006b 8e43 0000  .....B.....k.C..
004212d0: 0000 00c2 4e44 0000 0000 005a 0d45 0000  ....ND.....Z.E..
004212e0: 0000 00ba c645 0000 0000 0087 3c46 0000  .....E......<F..
004212f0: 0000 0078 5e9c bc7b 5853 d7d6 ffbb 6290  ...x^..{XS....b.

# xxd F1_C6_1500F.ad1 | less
00421290: 0000 0000 0000 0000 0c00 0000 4a50 475f  ............JPG_
004212a0: 4669 6c65 2e6a 7067 d80e 4200 0000 0000  File.jpg..B.....
004212b0: 0600 0000 0000 0000 f010 4200 0000 0000  ..........B.....
004212c0: 3bce 4200 0000 0000 e68d 4300 0000 0000  ;.B.......C.....
004212d0: fb4d 4400 0000 0000 530c 4500 0000 0000  .MD.....S.E.....
004212e0: 47c5 4500 0000 0000 b23a 4600 0000 0000  G.E......:F.....
004212f0: 789c 9cbc 7b58 53d7 d6ff bb62 90a0 45e2  x...{XS....b..E.

# xxd F1_C9_1500F.ad1 | less
00421290: 0000 0000 0000 0000 0c00 0000 4a50 475f  ............JPG_
004212a0: 4669 6c65 2e6a 7067 d80e 4200 0000 0000  File.jpg..B.....
004212b0: 0600 0000 0000 0000 f010 4200 0000 0000  ..........B.....
004212c0: 39ce 4200 0000 0000 e38d 4300 0000 0000  9.B.......C.....
004212d0: f54d 4400 0000 0000 4c0c 4500 0000 0000  .MD.....L.E.....
004212e0: 35c5 4500 0000 0000 933a 4600 0000 0000  5.E......:F.....
004212f0: 78da 9cbc 7b58 53d7 d6ff bb62 90a0 45e2  x...{XS....b..E.

However, what about AD1 files where a compression level of 0 is selected? According to FTK Imager, this is equivalent to ‘no’ compression, and we can verify this by taking a closer look at one of the C0 AD1 samples:

# xxd F1_C0_1500F.ad1 | less
004211e0: 0000 0000 0000 0000 000c 0000 004a 5047  .............JPG
004211f0: 5f46 696c 652e 6a70 6729 0e42 0000 0000  _File.jpg).B....
00421200: 0006 0000 0000 0000 0041 1042 0000 0000  .........A.B....
00421210: 0051 1043 0000 0000 0061 1044 0000 0000  .Q.C.....a.D....
00421220: 0071 1045 0000 0000 0081 1046 0000 0000  .q.E.......F....
00421230: 0091 1047 0000 0000 00b4 b447 0000 0000  ...G.......G....
00421240: 0078 0100 fbff 0400 ffd8 ffe0 0010 4a46  .x............JF
00421250: 4946 0001 0100 0001 0001 0000 fffe 003b  IF.............;

In the output above, the zlib signature is still present at offset 0x421241, which correlates to zlib compression level 1. However, looking closer at the data, the raw contents of the JPG image file can be seen, meaning the raw file data has not actually been compressed. Reviewing the official zlib documentation, it is possible to specify a compression level of 0 by using Z_NO_COMPRESSION (5). Therefore, I assume that zlib compression level 0 must share the same signature as compression level 1.

Interestingly, the raw zlib data associated with a given file within an AD1 sample can be trivially extracted using the dd command. If we again take the PE file DLL_File.dll from the ‘Folder 2’ structure (F2_C3_1500F.ad1), we can easily calculate the start, end and length of the compressed zlib data manually. For instance, I knew the zlib data for the PE file within this AD1 sample started at offset 0x990, i.e. 2448 bytes into the sample. I also knew that the zlib data ended when we hit a sequence of NULL bytes, which occurs at offset 0x3b2b, meaning we can calculate the length of the data to be 12700 bytes (this includes one additional byte to account for the final byte of the stream):

# dd if=F2_C3_1500F.ad1 bs=1 skip=2448 count=12700 status=none > zlib.data
# file zlib.data 
zlib.data: zlib compressed data

Substituting the previously calculated values into the dd command, we can extract the raw zlib data comprising the PE file and verify its contents using the file command. From here, there are several ways to decompress the zlib data using Linux commands; in this instance, the openssl command will suffice:

# openssl zlib -d -in zlib.data > Extracted_DLL.dll
# file Extracted_DLL.dll 
Extracted_DLL.dll: PE32 executable (DLL) (console) Intel 80386 Mono/.Net assembly, for MS Windows
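
As an alternative to hunting for the trailing NULL bytes by hand, zlib can locate the end of the stream itself. The following is a minimal Python sketch; `extract_stream` and the synthetic `blob` are my own illustrative constructions, not part of the AD1 format:

```python
import zlib

def extract_stream(buf: bytes, start: int) -> tuple[bytes, int]:
    """Decompress the zlib stream beginning at `start`; the
    decompressor stops at end-of-stream, so the compressed length
    can be derived from whatever data it leaves unconsumed."""
    d = zlib.decompressobj()
    out = d.decompress(buf[start:])
    consumed = len(buf) - start - len(d.unused_data)
    return out, consumed

# Synthetic demo mirroring the layout above (not a real AD1 sample):
inner = b"MZ" + bytes(64)                  # stand-in for PE contents
blob = bytes(0x990) + zlib.compress(inner, 3) + bytes(16)
data, length = extract_stream(blob, 0x990)
assert data == inner
```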

It should be noted that this obviously carves an entirely new file from the AD1 container; thus, the new PE file will contain none of the original metadata. As a quick side note: in the data following the compressed zlib data within the AD1 file, there is a small section that reports the original file size. This can be seen in the AD1 hexadecimal data for the PE file at offset 0x3b53, where the ASCII reports the size as 29184 bytes. To verify this, we can use the du -b command on Linux to view the size, in bytes, of our extracted PE file:

# du -b Extracted_DLL.dll 
29184	Extracted_DLL.dll
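
If the original size really is stored as ASCII digits, as the data at offset 0x3b53 suggests, parsing it programmatically is trivial. This sketch assumes a NUL-terminated ASCII decimal string, an assumption that requires further testing against more samples:

```python
def read_ascii_size(buf: bytes, off: int) -> int:
    """Parse the speculated original-size field, assuming it is a
    NUL-terminated ASCII decimal string (unconfirmed assumption)."""
    end = buf.index(b"\x00", off)
    return int(buf[off:end])

# Hypothetical bytes around the size field, not a real AD1 slice:
print(read_ascii_size(b"\x00\x0029184\x00", 2))   # 29184
```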

Finally, past the timestamp data and the speculated file attribute data sections, we reach what appear to be hash values. It would make sense for forensic software such as FTK Imager to store this information to verify data integrity. Looking at the AD1 hexadecimal data for the PE file one last time, we can see that there is an MD5 and a SHA-1 hash value at offsets 0x3c98 and 0x3ccb respectively:

00003c90: 5000 0020 0000 0030 3861 6235 6537 3831  P.. ...08ab5e781
00003ca0: 6134 3863 3366 3632 3734 6561 6264 3239  a48c3f6274eabd29
00003cb0: 3663 6433 6234 6600 0000 0000 0000 0001  6cd3b4f.........
00003cc0: 0000 0002 5000 0028 0000 0064 3339 3337  ....P..(...d3937
00003cd0: 6565 6363 3361 6261 3830 6137 3537 3166  eecc3aba80a7571f
00003ce0: 3435 6333 3730 6437 6238 3163 6462 6335  45c370d7b81cdbc5
00003cf0: 3237 6100 0000 0000 0000 0087 3c00 0000  27a.........<...
  • MD5: 08ab5e781a48c3f6274eabd296cd3b4f
  • SHA-1: d3937eecc3aba80a7571f45c370d7b81cdbc527a

As with the file size, this is very easy to verify by using the md5sum and sha1sum tools on Linux against the extracted PE file to see whether they match:

# md5sum Extracted_DLL.dll 
08ab5e781a48c3f6274eabd296cd3b4f  Extracted_DLL.dll
# sha1sum Extracted_DLL.dll 
d3937eecc3aba80a7571f45c370d7b81cdbc527a  Extracted_DLL.dll

Hence, we can see that the hash values match, proving that the zlib extraction and decompression methods used previously did not alter the file contents in any way. At this point, the data simply repeats starting from the next file or directory name until it reaches the end of the AD1 file, which we will call the ‘footer’ section.
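
For scripted verification, the same comparison can be performed with Python's hashlib; the `verify_hashes` helper here is my own convenience wrapper, not anything defined by the AD1 format:

```python
import hashlib

def verify_hashes(data: bytes, md5_hex: str, sha1_hex: str) -> bool:
    """Compare extracted file contents against the MD5 and SHA-1
    values recovered from the AD1 structure."""
    return (hashlib.md5(data).hexdigest() == md5_hex
            and hashlib.sha1(data).hexdigest() == sha1_hex)

# Self-contained check using the well-known digests of empty input:
assert verify_hashes(b"",
                     "d41d8cd98f00b204e9800998ecf8427e",
                     "da39a3ee5e6b4b0d3255bfef95601890afd80709")
```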

Many file formats will often have a ‘footer’ (sometimes called a ‘trailer’) contained at the end of the file data which can be used for a variety of reasons. In most cases however, such a section is simply used to denote the end of the file data. This is useful for file carving tools, which will use header and footer signature values to ascertain where specific file data begins and ends. In this regard, AD1 files are no different, and contain a section at the end of the container of varying size.

Interestingly, when I examined the hexadecimal data at the very end of each AD1 sample, I noticed that while some samples had commonalities, there was no consistent footer ‘signature’ across all of them. Additionally, the footer data section was inconsistent in size. However, the footer section always appeared to start when the following data signature is reached:

4154545247554944 ATTRGUID

For instance, this can be seen in the ‘Folder 1’ sample F1_C5_1500F.ad1:

004682e0: 8046 0000 0000 0004 0000 0005 1000 0004  .F..............
004682f0: 0000 0074 7275 652b 8146 0000 0000 0001  ...true+.F......
00468300: 0000 0001 5000 0020 0000 0036 6166 3461  ....P.. ...6af4a
00468310: 3832 3639 3361 3430 3362 3064 3061 6664  82693a403b0d0afd
00468320: 6531 3639 3732 3436 3666 3500 0000 0000  e16972466f5.....
00468330: 0000 0001 0000 0002 5000 0028 0000 0031  ........P..(...1
00468340: 6162 3861 3364 3063 6632 3263 6465 3233  ab8a3d0cf22cde23
00468350: 3137 3362 3662 3431 3532 3133 3737 6330  173b6b41521377c0
00468360: 6664 6265 6561 3841 5454 5247 5549 4400  fdbeea8ATTRGUID.
00468370: 0000 000f 0000 0002 0000 00f4 73d8 8ab0  ............s...
00468380: c06a 47ba 1a36 81c1 816b 0b03 0000 008a  .jG..6...k......
00468390: 8408 354c 7621 4797 9a87 0e1c 1a33 e007  ..5Lv!G......3..
004683a0: 0000 0037 a6cc e543 0457 4fbe ca80 36e4  ...7...C.WO...6.
004683b0: f785 7408 0000 00fc 0d10 f335 c6ef 40ab  ..t........5..@.
004683c0: 7b2e a4f1 1d89 4709 0000 00c8 a66c adef  {.....G......l..
004683d0: db57 429c 27ad a44e 0fe8 b70d 0000 00b6  .WB.'..N........
004683e0: 8132 8cde c3b0 4ca4 5360 74b6 89b9 090e  .2....L.S`t.....
004683f0: 0000 007e 4a0f 065f eba4 4b81 3f08 8b68  ...~J.._..K.?..h
00468400: 81d6 7f1e 0000 0064 ac64 ce64 3a7e 4f85  .......d.d.d:~O.
00468410: 3281 2896 8558 0c02 1000 0045 5504 429e  2.(..X.....EU.B.
00468420: ad1f 4281 e3f7 89b5 4179 8e03 1000 00f7  ..B.....Ay......
00468430: f66a a13d aa14 4d96 1001 73dc 81a4 3604  .j.=..M...s...6.
00468440: 1000 005e 10cb e935 62ab 45aa 652e 2a60  ...^...5b.E.e.*`
00468450: 5e95 b405 1000 0065 795c 1d07 1462 43bd  ^......ey\...bC.
00468460: edfe 175a 4d71 4001 5000 000a 33e2 6be2  ...ZMq@.P...3.k.
00468470: 15ea 4db6 3e21 98e8 1eab 7502 5000 0018  ..M.>!....u.P...
00468480: ffe8 4027 2324 4c93 58db ea9f a4dd 5c02  ..@'#$L.X.....\.
00468490: 0001 0065 9dab 5ef2 a3a1 44aa dd15 5f55  ...e..^...D..._U
004684a0: 06fd 164c 4f43 5347 5549 4400 0000 0003  ...LOCSGUID.....
004684b0: 0000 0001 0000 003e a0fa a262 498a 44ac  .......>...bI.D.
004684c0: 5889 8368 016f c702 0000 005e 6ac1 6efe  X..h.o.....^j.n.
004684d0: cef8 488f 37dc 2342 d969 db03 0000 0042  ..H.7.#B.i.....B
004684e0: f2cb 6423 6670 4cb5 63af b179 c5c8 36    ..d#fpL.c..y..6

From the output above, the SHA-1 value of the final file within the AD1 structure can be seen immediately prior to the ATTRGUID signature, after which the footer data continues until the end of the AD1 file is reached. The next step from here would be to dissect this data section further to ascertain whether it held any meaningful information.

However, after reviewing this data across various AD1 samples, it quickly became apparent that it was also very inconsistent between them, in both content and length. Unfortunately, the footer structure also does not resemble a recognisable file appended to the AD1 container, as the file command reports:

# dd if=F2_C9_1500F.ad1 bs=1 skip=34751 status=none > F2_C9_1500F.ad1.footer
# file F2_C9_1500F.ad1.footer 
F2_C9_1500F.ad1.footer: data
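
Rather than eyeballing hexadecimal output, the skip offset used above can be found by searching for the ATTRGUID marker. A minimal Python sketch follows (the shell equivalent would be `grep -abo ATTRGUID F2_C9_1500F.ad1`); the `sample` bytes are synthetic:

```python
def footer_offset(data: bytes) -> int:
    """Offset of the ATTRGUID marker that opens the footer
    section, or -1 if the sample lacks one."""
    return data.find(b"ATTRGUID")

# Synthetic sample: arbitrary body bytes followed by a footer.
sample = b"\xaa" * 34751 + b"ATTRGUID" + b"\x00" * 16
print(footer_offset(sample))   # 34751
```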

Summary of Findings

AD1 Header Data

The table presented below provides an overview of the data contained within the AD1 header sections of the samples I tested:

| OFFSET | LENGTH | HEXADECIMAL VALUE | DESCRIPTION | LOGICAL IMAGE | CUSTOM CONTENT |
| --- | --- | --- | --- | --- | --- |
| 0x0 | 15 bytes | 41445345474d454e54454446494c45 | Signature: ADSEGMENTEDFILE | Yes | Yes |
| 0x22 | 5 bytes | Variable | Fragmentation Level | Yes | Yes |
| 0x200 | 14 bytes | 41444c4f474943414c494d414745 | Signature: ADLOGICALIMAGE | Yes | Yes |
| 0x230 | 2 bytes | 4144 | AD signature | Yes | Yes |
| 0x25c | 30 bytes | 437573746f6d20436f6e74656e7420496d616765285b4d756c74695d297d | Custom Content Signature | No | Yes |
| 0x25c | N/A | N/A | Start of file data | Yes | No |
| 0x2a9 | N/A | N/A | Start of file data | No | Yes |

AD1 File/Folder Data Structure(s)

The table presented below provides a high-level overview of the individual file/folder structure contained within AD1 files:

| SEQUENCE | DATA | FILE | DIRECTORY |
| --- | --- | --- | --- |
| 1 | Name | Yes | Yes |
| 2 | ? Structure Data ? | Yes | Yes |
| 3 | Compressed Data | Yes | No |
| 4 | Original Size | Yes | No |
| 5 | ACM Timestamps | Yes | Yes |
| 6 | ? Attribute Data ? | Yes | Yes |
| 7 | Hash Values | Yes | No |

AD1 Compression Signatures

The table presented below provides an overview of the hexadecimal zlib signatures associated with the corresponding level of AD1 compression selected by the examiner:

| AD1 COMPRESSION LEVEL | ZLIB SIGNATURE |
| --- | --- |
| 0 | 0x7801 |
| 1 | 0x7801 |
| 2 | 0x785e |
| 3 | 0x785e |
| 4 | 0x785e |
| 5 | 0x785e |
| 6 | 0x789c |
| 7 | 0x78da |
| 8 | 0x78da |
| 9 | 0x78da |

The table below simply provides the hexadecimal signature used in AD1 files to denote the start of the footer data section:

| HEXADECIMAL VALUE | DESCRIPTION |
| --- | --- |
| 4154545247554944 | Start of footer section signature: ATTRGUID |

Other Data

In addition to the above tables, please find below a list of other findings relating to the AD1 file format which may prove useful:

  • AD1 files can be created using one of two methods; Logical Image or Custom Content
  • The only discernible difference in data between these two methods resides in the AD1 header section
  • AD1 containers store raw file content using zlib compression (when compression is selected)
  • The character length of each file/folder name is stored in the data preceding the name
  • The AD1 file size is stored in the header section (stored value appears to omit header/footer data, requires further testing)

Concluding Statements

This article was published with the intention of sharing the findings of my personal research and experimentation into the AccessData AD1 file format. Under normal circumstances, I would apply the scientific method as my primary research methodology, as seen in my other articles; however, this research was much more speculative in nature. This was due to a number of reasons, the foremost being that I was essentially attempting to dissect an undocumented, proprietary forensic data format. The second was that there are many variations of AD1 files that can be created across many different versions of FTK Imager, and unfortunately, I did not have the time to dedicate to exploring all of these possibilities.

Given the number of variables associated with the dissection of the AD1 file format, I decided to focus my experiments on samples created from just one (newer) version of FTK Imager. All of my experimentation and testing included AD1 samples taken from various online resources, as well as the control samples I generated myself. The biggest problem I faced during the testing process was the copious amount of mismatching data between the samples, which made it difficult to cross-reference data sets. As such, I made many assumptions about the data based on the results I had and the tests I was able to perform within the timeframe available to me.

Reviewing the information I produced from my testing, I believe it should be hypothetically possible to build a Linux-compatible command-line tool to extract the file data (with their original metadata) from AD1 containers and present it in a logical structure for an analyst to work on. I know this is already possible using Windows-based techniques that were outlined earlier in the article.

Overall, I hope that the results of the research I conducted here will provide a useful foundation for those looking to explore the AD1 file format further, or for those who may just want to understand how AD1 containers store and compress the file data. For this reason, I decided to include an executive summary of the primary findings resulting from my experiments to serve as a quick reference for those who may require it.

If you have any recommendations or questions about the topics mentioned in this article, please contact me on Twitter.

– Mairi

References

  1. Lefton, S. (2016). Native Production—Advantages to Producing E-Discovery Natively [Accessed 2021-06-02]. 

  2. AccessData Group, Inc. (2016). Imager User Guide [Accessed 2021-05-28]. 

  3. Carrier, B. (2005). File System Forensic Analysis. Addison-Wesley:Boston, MA. 

  4. Koruga, P. & Baca, M. (2010). Analysis of B-tree Data Structure and its Usage in Computer Forensics. University of Zagreb:Croatia. 

  5. Gailly, J. & Adler, M. (2017). zlib 1.2.11 Manual [Accessed 2021-06-01]. 

This post is licensed under CC BY 4.0 by the author.