2016년 8월 11일 목요일
ESEDBForensic- Recovery Tool of deleted records and tables from ESE Database
In DFRWS 2016 USA, I was Speaker on this topic :
Recovery method of deleted records and tables from ESE Database
In my paper, there is the following phrase :
we developed a tool to extract normal records in the ESE database and recover deleted records, without using the database API.
(O M I T)
EDBForensic.py <Input Path> <Output Path>
Now, I am converting this program from CLI to GUI.
This program will be released later this month (about Sep. 2016)
As soon as possible, I will update it. Thanks!
2016년 8월 5일 금요일
Basic structure of ESE Database
1. ESE database
The Extensible Storage Engine (ESE), also known as JET Blue is a ISAM data storage technology developed by Microsoft. The ESE is a core component in Branch Cache, Active Directory, and Microsoft Server.
The purpose of the ESE is to save and manage data through indexing and sequential access. Numerous Windows components such as Desktop Search, Active Directory use the ESE. It provides a collision recovery mechanism in order to maintain data consistency in times of system event occurrence. The ESE is suitable for server applications because it supports realtime transactions. The ESE cache guarantees high data access performance. Moreover, the ESE is light enough to be fit for auxiliary applications.
The ESE runtime (ESENT.DLL) exists in all Windows releases after the x32 edition of Microsoft Exchange 2003, and the x64 edition of Windows Server 2003, Windows XP, and Windows 2010. Microsoft Exchange has supported 32 bits until Exchange 2003 version, and now Exchange 2007 supports 64 bits.
1.1 Database
A database is an object which groups data physically, logically. The ESE database seems to be a single file in Windows, but internally it is a set of pages whose sizes are 2, 4, 8, 16, 32 KB and managed in a B-tree structure. (16 and 32 KB papges are used only in Windows 7, Exchange 2010, and higher versions). These pages include a database, data itself, an index for managing data orders, and meta data for describing data included in other information. A database can include up to 8 KB-sized pages, 16 TB data, and 232 pages.
1.2 Table
A table is a homogeneous set of records. Each record has the same column set. Each table is identified by table names, and its range is bound by database. The size of disk space allotted to tables in a database is determined by parameters given when the tables are created through the CreateTable operation. Tables are increased automatically as data is written. Tables have at least one index. In a record data, there is at least one clustered index. If a cluster index is not defined in an application, an artificial index is created according to the time order of record insertion, to cluster and arrange records. Indexes are defined to sustain the data order, and enables data access by index order and column value. ESE’s clustered index is important that an index key should be unique. A clustered, or non-clustered index is indicated by using B-tree.
If insertion and update commands exceed a page, then the page is divided. A new page is allotted between the two pages which are previously adjacent in a logical sense. The ESE features on-line compression to re-compress data. If frequent updates are excepted in a table, a space for future data insertion is reserved by designating a proper page density when creating a table or an index, which helps to avoid or delay a split command.
The ESE database is composed of groups called instances. Most applications use a single instance. But all application can use several instances. Through these instances, more than one database can be connected with recovery log files. Currently, up to 6 user databases can be attached to the ESE instance. Each process that ESE uses can have up to 1024 ESE instances.
A database can be separated from the ESE instances under execution, and later attached to the same or other instances under execution. During separation, the database is copied, using Standard Windows utility. It cannot be copied when opened independently and used in the activated state. The database can be operated in any devices that physically support Windows I/O commands for direct address designation.
1.3 Record and Column
A record is related to the set of column values. Records are inserted and updated through Update command, and deleted through Delete command. Columns are configured and retrieved through SetColumns and RetrieveColumns commands. The maximum size of a record 8,110 bytes, except for Long value columns, in case that the page size 8 KB. LongText and LongBinary column types have no size bound. Records can contain data bigger than the size of database when the data is saved in long value columns. Only 9-byte internal data is required when long value reference is stored in records, and this long value can support up to 2 GB.
Generally, record values fit for the same column set are recorded as records. It is also possible to define a table that has multiple columns and include non-null-value columns in the ESE. In this respect, a table can be a set of heterogeneous records.
The ESE supports a broad range of columns from 1 bit to 2 GB. The type of columns determine characteristics including index order so that it is vital to designate appropriate columns. The ESE supports types presented in Table 1.
[Table] Column type and description
The column type of a long text and a long binary is a big-sized binary entity. This entity is saved in B-tree, and indexed by a column ID and a byte offset. The ESE supports append, byte-ranged-overwrite, and size designation. Also, ESE can save a single instance and thereby refer to multiple records for a binary entity. The maximum size of Long Text and Long Binary columns is 2 GB.
The ESE logging and recovery functions guarantee data integrity and consistency against system crash. Logging is a process that records update information of database. The structure of log files is very safe against system crash. If a system crashes, The log files are used to turn database into consistent state. They can also protect data from media error. ESE supports on-line backup function to copy a database and log files without influencing database operations. Through this, a database can process query and update while backup is done.
Most of the ESE database structure was summarized as a document by Metz. ESE database files are comprised of multiple pages except for the database header, and manage pages in B-Tree structure.
There are multiple tables inside ESE database, and table information is managed by a catalog table called MSysObject. Certain tables include sub-tables called LV, in order to save big-sized data. Tables have their own identification numbers, known as Father data page (FDP). This is specified in all page headers in tables. The sub-tables called LV also use distinct FDP numbers, and are managed in the same way as the superordinate table.
In the database header, there is an unique signature in hexadecimal, and database state items and page size information are saved. The database header is important information that determines ESE database analysis methods, and combined with page numbers helps to find a page offset. The flowing figures show the database header format in hexadecimal.
MSysObject table is situated in the fourth page and explains about all the tables including itself. This information comprises ObjidTable, type, ID, ColtypOrPgnoFDP, space usage, name, etc. The following figure demonstrates the location and the items of MSysObject table. ObjidTable is a FDP number for each table. Interpretation methods for other fields differ depending on Type field values. The following table shows the kind of Types and the descriptions for each Type. Fields explained below are Type-dependent. The next table describes how to interpret records, depending on Type fields.
[Table] Type in MSysObject and description
Name
|
Description
|
Bit
|
ternary value (NULL, 0, or 1)
|
Unsigned Byte
|
1-byte unsigned integer
|
Short
|
2-byte signed integer
|
Unsigned Short
|
2-byte unsigned integer
|
Long
|
4-byte signed integer
|
Unsigned Long
|
4-byte unsigned integer
|
Long Long
|
8-byte signed integer
|
Unsigned Long Long
|
8-byte unsigned integer
|
Currency
|
8-byte signed integer
|
IEEE Single
|
4-byte floating-point number
|
IEEE Double
|
8-byte floating-point number
|
DateTime
|
8-byte date-time
(integral date, fractional time) |
GUID
|
16-byte unique identifier
|
Binary
|
Binary string, length
|
Text
|
ANSI or Unicode string, length
|
Long Binary
|
Large binary string, length
|
Long Text
|
Large ANSI or Unicode string, length
|
1.4 Column Types
Each ESE table can be defined with up to 127 fixed columns, 129 variable columns, and 64,993 tagged columns.
1.4.1 Fixed Columns
A fixed column is a column that has the same size of space in each record irrespective of values. The fixed column has 1 bit space for expressing a null value, and the size of each record is allotted according to what is defined in the column.
1.4.2 Variable Columns
A variable column is a column that saves variable values corresponding to values of a certain column. Variable columns write a null value and a size value in 2 bytes, and save variable values.
1.4.3 Tagged Columns
A tagged column does not occupy any space if not configured in a record. Tagged columns can be a single value or multiple values. If a tagged column is set in a record, each instance takes up 4-byte space.
1.5 Method for big-sized binary entity
The column type of a long text and a long binary is a big-sized binary entity. This entity is saved in B-tree, and indexed by a column ID and a byte offset. The ESE supports append, byte-ranged-overwrite, and size designation. Also, ESE can save a single instance and thereby refer to multiple records for a binary entity. The maximum size of Long Text and Long Binary columns is 2 GB.
1.6 Logging and Crash Recovery
The ESE logging and recovery functions guarantee data integrity and consistency against system crash. Logging is a process that records update information of database. The structure of log files is very safe against system crash. If a system crashes, The log files are used to turn database into consistent state. They can also protect data from media error. ESE supports on-line backup function to copy a database and log files without influencing database operations. Through this, a database can process query and update while backup is done.
2. ESE Database Structure Analysis
Most of the ESE database structure was summarized as a document by Metz. ESE database files are comprised of multiple pages except for the database header, and manage pages in B-Tree structure.
There are multiple tables inside ESE database, and table information is managed by a catalog table called MSysObject. Certain tables include sub-tables called LV, in order to save big-sized data. Tables have their own identification numbers, known as Father data page (FDP). This is specified in all page headers in tables. The sub-tables called LV also use distinct FDP numbers, and are managed in the same way as the superordinate table.
2.1 Database Header
In the database header, there is an unique signature in hexadecimal, and database state items and page size information are saved. The database header is important information that determines ESE database analysis methods, and combined with page numbers helps to find a page offset. The flowing figures show the database header format in hexadecimal.
2.2 MSysObject Table
MSysObject table is situated in the fourth page and explains about all the tables including itself. This information comprises ObjidTable, type, ID, ColtypOrPgnoFDP, space usage, name, etc. The following figure demonstrates the location and the items of MSysObject table. ObjidTable is a FDP number for each table. Interpretation methods for other fields differ depending on Type field values. The following table shows the kind of Types and the descriptions for each Type. Fields explained below are Type-dependent. The next table describes how to interpret records, depending on Type fields.
[Figure] MSysObject Catalog position and name
[Table] How to analyze MSysObject table
Type
|
Id
|
ColtypOrPgnoFDP
|
Space Usage
|
Name
|
1
|
FDP id
|
FDP number
|
the number of page
|
Table name
|
2
|
Column id
|
Column type
|
Column size
|
Column name
|
4
|
Lv FDP id
|
Lv FDP number
|
the number of page
|
LV
|
[Table] Type in MSysObject and description
Type
|
Description
|
1
|
table
|
2
|
column
|
3
|
index
|
4
|
long value(Lv)
|
5
|
call back
|
ID means an identification number for Types. When the type value is 1, this is the same as ObjidTable. When the type value is 2, this is the number of a column and through this ID range, data type can be identified. The following table describes data types in accordance with ID ranges when Type is 2. When the type value is 4, this is a FDP number of a long value page.
Generally, ColtypOrPgnoFDP refers to a Father Data Page (FDP) number, through which Root page number can be known. However, exceptionally when the type value is 2, this means a column type.
[Table] Range of ID related data type
Space Usage is the number of pages, generally. Exceptionally, when the type value is 2, this signifies the number of byte of a column.
Name is the name of Type. When the type value is 1, this value represents a name of a table. When the type value is 2, this means a column name.
A page is a logical unit used to save and mange records in the ESE database, and composed of header, data, and tag areas. The following figure demonstrates the structure of a page. In the header, there is an item, expressed as Available Page Tag in the figure, that represents the number of tags, and this value corresponds to the number of records. A tag exists at the end of a page and increases in a reverse order. This value has the offset and the size of records, and 2 bytes are allotted for each. A record exists at the offset from the end of the page header, and its size is the same as Record Size. The first value in the tag area is not a tag that points to a real record, so analysis requires special attention.
Page types have been known so far, such as data, branch, empty, space tree, index, long value, and pages can be distinguished by page flag values. A record storage method is page-type dependent, and can be identified through analyzing data, branch, and long value pages. Data pages refers to a page in which real data of a table is recorded. A data page is in B-Tree structure and consists of a root page and a leaf page. A root page is the top page in the B-Tree structure, and the page number is represented as ColtypOrPgnoFDP when the type value is 1 in the catalog table. A leaf page is the bottom page in B-Tree structure. Leaf pages are structured as double-linked list, and in the header are previous and next page number. A branch page is a page where sub-page numbers are recorded in B-Tree structure. A long value page refers to a page used to save big data in the LV table.
A page header contains overall information on a page. The size of a header is 0x50h when the page size is more than 16 KB, and otherwise 0x28h. The next figure shows the page header format in hexadecimal. The page flags item in the page header refers to the kind of pages. Table 5 is a list of flags used to represent the kind of pages.
Generally, ColtypOrPgnoFDP refers to a Father Data Page (FDP) number, through which Root page number can be known. However, exceptionally when the type value is 2, this means a column type.
[Table] Range of ID related data type
Range of Id
|
Data type
|
0x0001 - 0x007F
|
Fixed size item types
|
0x0080 - 0x00FF
|
Variable size item types
|
0x0100 - 0xFFFF
|
tagged item types
|
Space Usage is the number of pages, generally. Exceptionally, when the type value is 2, this signifies the number of byte of a column.
Name is the name of Type. When the type value is 1, this value represents a name of a table. When the type value is 2, this means a column name.
2.3 Page
A page is a logical unit used to save and mange records in the ESE database, and composed of header, data, and tag areas. The following figure demonstrates the structure of a page. In the header, there is an item, expressed as Available Page Tag in the figure, that represents the number of tags, and this value corresponds to the number of records. A tag exists at the end of a page and increases in a reverse order. This value has the offset and the size of records, and 2 bytes are allotted for each. A record exists at the offset from the end of the page header, and its size is the same as Record Size. The first value in the tag area is not a tag that points to a real record, so analysis requires special attention.
Page types have been known so far, such as data, branch, empty, space tree, index, long value, and pages can be distinguished by page flag values. A record storage method is page-type dependent, and can be identified through analyzing data, branch, and long value pages. Data pages refers to a page in which real data of a table is recorded. A data page is in B-Tree structure and consists of a root page and a leaf page. A root page is the top page in the B-Tree structure, and the page number is represented as ColtypOrPgnoFDP when the type value is 1 in the catalog table. A leaf page is the bottom page in B-Tree structure. Leaf pages are structured as double-linked list, and in the header are previous and next page number. A branch page is a page where sub-page numbers are recorded in B-Tree structure. A long value page refers to a page used to save big data in the LV table.
A page header contains overall information on a page. The size of a header is 0x50h when the page size is more than 16 KB, and otherwise 0x28h. The next figure shows the page header format in hexadecimal. The page flags item in the page header refers to the kind of pages. Table 5 is a list of flags used to represent the kind of pages.
[Figure] Page Structure
[Figure] Page Header format
[Table] Page flags and Description
Flag
|
Description
|
0x00000001
|
Root page
|
0x00000002
|
Leaf page
|
0x00000004
|
Branch page
|
0x00000008
|
Empty page
|
0x00000020
|
Space tree page
|
0x00000040
|
Index page
|
0x00000080
|
Long value page
|
2.4 Page Tag Structure
Generally, the tag area in a page records the offset and the size of records. The following figure is a picture that presents the structure of a single tag existing in the tag area of a page. A page tag flag exists in the tag when page size is less than 16 KB, and otherwise in the record header. Page tag flag is categorized into 3 kinds. The following table is a table for page tag flag values and descriptions. The values determine parsing methods.
The number of page tags exist in the page header and are recorded in a reverse order from the end of pages. The first tag is used, unrelated to pointing to a record, and the usage differs depending on the kind of pages.
[Figure] Page tag structure
Flag
|
Value
|
Description
|
V
|
0x20
|
Value contains variable sized data types
|
D
|
0x40
|
Value is no longer used
|
C
|
0x80
|
value contains a common page key size
|
2016년 8월 2일 화요일
IE10Analyzer
IE10Analyzer
This tool can parse normal records and recover deleted records in WebCacheV01.dat.
WebCacheV01.dat is used in Internet Explorer 10 or 11 and Edge Browser.
Download Program (Free)
(Version : 1.0.0.6 Updated, Feb 18, 2020)
(MD5 : 0550E88E71E7C473EE0E733AB19C72AD)
Repository: https://github.com/moaistory/IE10Analyzer
IE10Analyzer Advantages :
- IE10Analyzer can recovery deleted records.
- IE10Analyzer can view contents about InPrivate Browsing.
- IE10Analyzer can extract and analyze WebCacheV01.dat from live system.
- Regardless of status of the file, IE10Analyzer can parse any file. (Dirty status is OK)
- IE10Analyzer shows more information than the other tools .
(Download file information, HTTP Response Header, Web Page Title)
- IE10Analyzer can convert from ESE database to another form : CSV, sqlite
- IE10Analyzer can apply to UTC time.
2016년 7월 31일 일요일
Internet Explorer 10, 11, Microsoft Edge Forensic Tool
IE10Analyzer Download Link : Click this
1. Introduction
Since 1994, a database engine developed by Microsoft, called Extensible Storage Engine (ESE), or JET Blue, has been used mainly in web browsers (e.g. Internet Explorer, and Microsoft Edge)
Chivers and Hargreaves found through analyzing Windows Search that deleted records remain in the database because of B-Tree [1]. Using the same method, Chivers carved deleted records in the WebCacheV01.dat file which is used in Internet Explorer version 10 or above, analyzed recovered records, and then identified bit masks that can distinguish Private browsing from type items [2]. Furthermore, Chivers researched in which situation it is possible to recover Private browsing. These two studies of Chivers all analyzed records by using a tool operating based on the database API. This tool needs to switch the database in the dirty state to the clean state, This switch may remove most recent Private browsing records that remain only in the dirty state. Also, the tool cannot recover values saved in long value pages, and the final items of records.
Results recovered by using ESE carve
Results recovered by using IE10Anlayzer
There are two reasons that it is possible to recover deleted records in the ESE database file. Firstly, if the Data page or the Long value page are turned into the Branch page, the pre-existing data remains in the Branch page. Secondly, if records are deleted, the tag are and the data are not deleted, but rather the number of record and the kind of pages change only.
Previously, I made a tool which is able to recover deleted records from ESE database. But this tool has some constraints to analyze IE 10, 11 and Microsoft Edge. To analyze IE history, the following functions are required.
1) extracting strings from ResponseHeaders fields
- HTTP Response Header
- Download information
- Web page title
2) url decoding
3) difference between private browsing and normal browsing
So, I additionally make a program which is only used to analyze IE10, Microsoft Edge. The program name is IE10Analyzer and .net 4.0 version must be installed .
2. Internet Explorer 10, 11, Microsoft Edge Forensic on Windows 10
(Before I write this post, I have used Internet explorer and Edge.)
1) Clear browsing data to be exact.
2) Search for 'Microsoft Edge' and 'notepad' in Google.
3) Download notepad++.exe
4) Run IE10Analyzer
5) Start the analysis and Set UTC Time
6) Confirm the result
- web page title (remains!)
- download information (remains!)
- HTTP response header (remains!)
- Existing Internet Explorer data (remains!)
7) Clear browsing data on Internet Explorer
8) Start InPrivate Browsing
9) Confirm the result about recovered data
10) Confirm the result about private browsing
[1] H. Chivers and C. Hargreaves, “Forensic data recovery from the windows search database,” Digital Investigation, vol. 7, no. 3-4, pp. 114-126, Apr. 2011.
[2] H. Chivers, “Private browsing: A window of forensic opportunity,” Digital Investigation, vol. 11, no. 1, pp. 20-29, Mar. 2014.