“What does metadata mean?” is a question we get every day. It is best answered by looking at an electronic document's properties and explaining the different types of metadata. We wish to thank United States Magistrate Judge Frank Maas for sharing his explanations in Adriana Aguilar v. Immigration and Customs Enforcement Division of the United States Department of Homeland Security (PDF):
1. Types of Metadata
Metadata, frequently referred to as “data about data,” is electronically-stored evidence that describes the “history, tracking, or management of an electronic document.” It includes the “hidden text, formatting codes, formulae, and other information associated” with an electronic document. The Sedona Principles-Second Edition: Best Practices Recommendations and Principles for Addressing Electronic Document Production Cmt. 12a (Sedona Conference Working Group Series 2007), http://www.thesedonaconference.org/content/miscFiles/TSC_PRINCP_2nd_ed_607.pdf (“Sedona Principles 2d”); see also Autotech Techs. Ltd. P’Ship v. Automationdirect.com, Inc., 248 F.R.D. 556, 557 n. 1 (N.D.Ill.2008) (Metadata includes “all of the contextual, processing, and use information needed to identify and certify the scope, authenticity, and integrity of active or archival electronic information or records”). Although metadata often is lumped into one generic category, there are at least several distinct types, including substantive (or application) metadata, system metadata, and embedded metadata. Sedona Principles 2d Cmt. 12a; see United States District Court for the District of Maryland, Suggested Protocol for Discovery of Electronically Stored Information 25-28, http://www.mdd.uscourts.gov/news/news/ESIProtocol.pdf (“Md. Protocol”).
a. Substantive Metadata
Substantive metadata, also known as application metadata, is “created as a function of the application software used to create the document or file” and reflects substantive changes made by the user. Sedona Principles 2d Cmt. 12a; Md. Protocol 26. This category of metadata reflects modifications to a document, such as prior edits or editorial comments, and includes data that instructs the computer how to display the fonts and spacing in a document. Sedona Principles 2d Cmt. 12a. Substantive metadata is embedded in the document it describes and remains with the document when it is moved or copied. Id. A working group in the District of Maryland has concluded that substantive metadata “need not be routinely produced” unless the requesting party shows good cause. Md. Protocol 26.
b. System Metadata
System metadata “reflects information created by the user or by the organization’s information management system.” Sedona Principles 2d Cmt. 12a. This data may not be embedded within the file it describes, but can usually be easily retrieved from whatever operating system is in use. See id. Examples of system metadata include data concerning “the author, date and time of creation, and the date a document was modified.” Md. Protocol 26. Courts have commented that most system (and substantive) metadata lacks evidentiary value because it is not relevant. See Mich. First Credit Union v. Cumis Ins. Soc’y, Inc., No. Civ. 05-74423, 2007 WL 4098213, at *2 (E.D.Mich. Nov.16, 2007); Ky. Speedway, LLC v. Nat’l Assoc. of Stock Car Auto Racing, No. Civ. 05-138, 2006 WL 5097354, at *8 (E.D.Ky. Dec.18, 2006); Wyeth v. Impax Labs., Inc., 248 F.R.D. 169, 170 (D.Del.2006). System metadata is relevant, however, if the authenticity of a document is questioned or if establishing “who received what information and when” is important to the claims or defenses of a party. See Hagenbuch v. 3B6 Sistemi Elettronici Industriali S.R.L., No. 04 Civ. 3109, 2006 WL 665005, at *3 (N.D.Ill. Mar.8, 2006). This type of metadata also makes electronic documents more functional because it significantly improves a party’s ability to access, search, and sort large numbers of documents efficiently. Sedona Principles 2d Cmt. 12a.
c. Embedded Metadata
Embedded metadata consists of “text, numbers, content, data, or other information that is directly or indirectly inputted into a [n]ative [f]ile by a user and which is not typically visible to the user viewing the output display” of the native file. Md. Protocol 27. Examples include spreadsheet formulas, hidden columns, externally or internally linked files (such as sound files), hyperlinks, references and fields, and database information. Id. This type of metadata is often crucial to understanding an electronic document. For instance, a complicated spreadsheet may be difficult to comprehend without the ability to view the formulas underlying the output in each cell. For this reason, the District of Maryland working group concluded that embedded metadata is “generally discoverable” and “should be produced as a matter of course.” Id. at 27-28.
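As an illustration that is not part of the opinion, the difference between system metadata and embedded metadata can be sketched in a few lines of Python. The sketch assumes a spreadsheet stored in the Office Open XML format, where a worksheet is an XML file inside a ZIP archive and a cell can carry both a formula (`<f>`) and its last computed value (`<v>`). The file name `demo.xlsx`, the helper names, and the sample cells are hypothetical, and the demo archive holds only a single worksheet part, so it is not a complete workbook that a spreadsheet program would open.

```python
import os
import tempfile
import zipfile
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical sample sheet: C1 displays 5, but the formula behind it is A1+B1.
SHEET_XML = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<sheetData><row r="1">'
    '<c r="A1"><v>2</v></c>'
    '<c r="B1"><v>3</v></c>'
    '<c r="C1"><f>A1+B1</f><v>5</v></c>'
    "</row></sheetData></worksheet>"
)


def build_demo_file(directory):
    """Write a ZIP archive containing only the sample worksheet part."""
    path = os.path.join(directory, "demo.xlsx")
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
    return path


def system_metadata(path):
    """System metadata: kept by the file system, not stored inside the file."""
    st = os.stat(path)
    modified = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
    return {"size_bytes": st.st_size, "modified_utc": modified.isoformat()}


def embedded_formulas(path, member="xl/worksheets/sheet1.xml"):
    """Embedded metadata: formulas stored in the file but hidden in the output."""
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read(member))
    found = {}
    for cell in root.iterfind(".//m:c", NS):
        formula = cell.find("m:f", NS)
        if formula is not None:
            value = cell.find("m:v", NS)
            shown = value.text if value is not None else None
            found[cell.get("r")] = (formula.text, shown)
    return found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = build_demo_file(tmp)
        print(system_metadata(demo))
        print(embedded_formulas(demo))
```

A printout or image of this sheet would show only the value 5 in cell C1. The formula that produced it survives only in the native file, which is consistent with the working group's reasoning for treating embedded metadata as generally discoverable.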
