The criterion, “Will the form produced function in an e-mail client?” enables parties to explore a
broad range of functional native and near-native forms, not just PSTs. Such forms retain
essential features like Fielded Data, allowing users to reliably sort messages by date, sender,
recipients and subject, as well as Message IDs, supporting the threading of messages into
coherent conversations. Functional forms supply the UTC Offset Data within e-mails that allows
messages originating from different time zones and using different Daylight Saving Time
settings to be normalized across an accurate timeline. Forms that Function don’t disrupt
the Family Relationships between messages and attachments. Forms that Function
are inherently electronically searchable.
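To see why fielded data matters, consider a minimal sketch in Python, using only the standard library, of what a review tool can do when the headers survive production. The two messages and their addresses are invented for illustration; nothing here is drawn from a real production. The sketch reads each Date header with its UTC offset, converts the time to UTC, and uses Message-ID and In-Reply-To to tie a reply to its parent.

```python
from datetime import timezone
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical messages, invented for illustration only.
RAW_MESSAGES = [
    "Message-ID: <reply-2@example.com>\n"
    "In-Reply-To: <orig-1@example.com>\n"
    "Date: Mon, 03 Mar 2014 10:45:00 -0600\n"
    "From: Bob <bob@example.com>\n"
    "To: Alice <alice@example.com>\n"
    "Subject: RE: Lease terms\n"
    "\n"
    "Looks fine to me.\n",
    "Message-ID: <orig-1@example.com>\n"
    "Date: Mon, 03 Mar 2014 17:30:00 +0100\n"
    "From: Alice <alice@example.com>\n"
    "To: Bob <bob@example.com>, Carol <carol@example.com>\n"
    "Subject: Lease terms\n"
    "\n"
    "Draft attached.\n",
]


def fielded(raw):
    """Pull the sortable fields out of one message's headers."""
    msg = message_from_string(raw)
    sent = parsedate_to_datetime(msg["Date"])  # keeps the UTC offset
    return {
        "message_id": msg["Message-ID"],
        "in_reply_to": msg["In-Reply-To"],  # None when the header is absent
        "sender": msg["From"],
        "recipients": [addr for _, addr in getaddresses(msg.get_all("To", []))],
        "subject": msg["Subject"],
        "sent_utc": sent.astimezone(timezone.utc),
    }


records = [fielded(raw) for raw in RAW_MESSAGES]

# Normalize to one timeline: sort on UTC, not on the sender's local clock time.
timeline = sorted(records, key=lambda r: r["sent_utc"])

# Thread replies to parents by matching In-Reply-To against Message-ID.
by_id = {r["message_id"]: r for r in records}
for r in timeline:
    parent = by_id.get(r["in_reply_to"])
    note = f" (reply to {parent['subject']!r})" if parent else ""
    print(r["sent_utc"].isoformat(), r["sender"], r["subject"] + note)
```

Sorted on local clock time alone, the 10:45 reply would appear to precede the 17:30 message it answers. Strip the headers out or flatten the message to an image, and both the sort and the threading have to be rebuilt by hand.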
Best of all, producing Forms that Function means that all parties receive data in a form that
anyone can use in any way they choose, visiting the costs of converting to alternate forms on the
parties who want those alternate forms and not saddling parties with forms so degraded that
they are functionally fractured and broken.
If you are a requesting party, don’t be bamboozled by an alphabet soup of file extensions when it
comes to e-mail production (PST, OST, MSG, EML, DBX, NSF, MHTML, TIFF, PDF, RTF, TXT,
DAT, XML). Instead, tell the other side, “I want Forms that Function. If it can be imported into
Microsoft Outlook and work, that form will be fine by me.”
If the other side says, “We will pull all that information out of the messages and give it to you in a
load file,” say, “No thanks, leave it where it lays, and give it to me in a Form that Functions!”
Gmail is a giant database in a Google data center someplace (or in many places). Who can
guess what the native file format for cloud-based Gmail might be?
Mere mortals don’t get to peek at the guts of Google. But it little matters because, even if we
could name the native file format, we can’t obtain that format, nor can we faithfully replicate
its functionality locally.
16
Since we can’t get “true” native, how can we otherwise mirror the completeness and functionality
of native Gmail? When we get down to the lick log, a litigant doesn’t seek native forms for grins.
A litigant seeks native forms to secure the unique benefits native brings, principally functionality
and completeness.
There is a range of options for preserving a substantial measure of the functionality and
completeness of Gmail. One would be to produce in Gmail.
HUH?!?!
16
It was once possible to create complete, offline replications of Gmail using a technology called Gears;
however, Google discontinued support of Gears some time ago. Gears’ successor, called “Gmail Offline
for Chrome,” limits its offline collection to just a month’s worth of Gmail, making it a complete non-starter
for e-discovery. Moreover, neither of these approaches employed true native forms, as both were
designed to support a different computing environment.
Gmail
Yes, you could conceivably open a fresh Gmail account for production, populate it with
responsive messages and turn over the access credentials for same to the requesting party.
That’s probably as close to true native as you can get (though some metadata will change), and
it flawlessly mirrors the functionality of the source. Still, it’s not what most people expect or
want. It’s certainly not a form they can pull into their favorite e-discovery review tool.
Alternatively, as the Court noted in Keaton v. Hannum, 2013 U.S. Dist. LEXIS 60519 (S.D. Ind.
Apr. 29, 2013), an IMAP
17
capture to a PST format (using Microsoft Outlook or a collection tool)
is a practical alternative. What you get will not look or work exactly like Gmail (i.e., messages
won’t thread in the same way and flagging will be different); but, it will supply a large measure of
the functionality and completeness of the Gmail source. Plus, it’s a form that lends itself to many
downstream processing options.
So, what’s the native form of e-mail? Maybe it doesn’t matter. We should be less hung up on
the term “native” and instead specify the actual form or forms we seek that are best suited to
what we need and want to do with the data. That means understanding the differences between
the forms (e.g., what information they convey and their compatibility with your review tools), not
just demanding native like it’s a brand name.
When we seek “native” for a Word document or an Excel spreadsheet, it’s because we
recognize that the entire native file, and only the native file, supports the level of completeness
and functionality we need, a level that can’t be fairly replicated in any other form. But when we
seek native production of e-mail, we can’t expect to receive the entire “true” native file. We
understand that responsive and privileged messages must be segregated from the broader
collection and that there are a variety of near-native forms in which the responsive subset can be
produced so as to closely mirror the completeness and functionality of the source. What matters
most is getting all the important information within and about the message in a fielded form that
doesn’t completely destroy its character as an e-mail message.
So let’s not get too literal about native when it comes to native e-mail. Don’t seek native to
prove a point. Seek native to prove your case.
17
IMAP (for Internet Message Access Protocol) is another way that e-mail client and server applications
can talk to one another. IMAP version 4rev1 is described in RFC 3501. IMAP is not a form of e-mail
storage, but it is a means by which the structure (i.e., foldering) of webmail collections can be
replicated in local mail client applications like Microsoft Outlook. Another way that mail clients
communicate with mail servers is the Post Office Protocol or POP; however, POP is limited in important
ways, including in its inability to collect messages stored outside a user’s Inbox. Further, POP does not
replicate foldering. Outlook “talks” to Exchange servers using MAPI and to other servers and webmail
services using IMAP (or via POP, if IMAP is not supported).
Just Get Forms that Function
Enterprises increasingly rely on complex databases to manage recordkeeping and business
processes such that at least some of the evidence in your case likely exists only as a value
derived by querying a database.
Requesting parties typically ignore databases altogether in discovery. Else, they often demand
entire databases, little thinking what such a demand will entail were it to succeed. If the
database is built in Microsoft Access or some other simple tool, it’s feasible to acquire the
hardware and software licenses required to duplicate the producing party’s database
environment sufficiently to run the application. But, if the data sets require massive storage
resources or are built on enterprise-level database management systems (DBMS) like Oracle or
SAP, mirroring the environment is just about out of the question. I say “just about” because the
emergence of Infrastructure-as-a-Service Cloud-based computing suggests the potential for
mere mortals to deploy enterprise-level computing environments on a pay-as-you-go basis.
A more likely production scenario is to narrow the data set by use of filters and queries, then
either export the responsive data to a format that can be analyzed in other applications (e.g.,
exported as extensible markup language (XML), comma separated values (CSV) or in another
delimited file) or run reports (standard or custom) and ensure that the reports emerge in a
fielded, delimited format that supports electronic search.
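As a rough sketch of what “filter, then export” looks like in practice, the Python below builds a tiny in-memory SQLite table, runs a narrowing query and writes the responsive rows to CSV with a header row. The table, its column names and its values are hypothetical, and an enterprise DBMS will have its own export tools; the point is only that the output stays fielded and delimited.

```python
import csv
import io
import sqlite3

# Hypothetical table and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments ("
    "id INTEGER PRIMARY KEY, lease TEXT, paid_on TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO payments (lease, paid_on, amount_cents) VALUES (?, ?, ?)",
    [
        ("L-100", "2013-01-15", 125000),
        ("L-100", "2013-02-15", 125000),
        ("L-200", "2013-02-20", 98000),
        ("L-100", "2014-01-15", 130000),
    ],
)


def export_responsive(conn, lease, start, end):
    """Return the responsive rows as CSV text, field names in the first row."""
    cur = conn.execute(
        "SELECT id, lease, paid_on, amount_cents FROM payments "
        "WHERE lease = ? AND paid_on BETWEEN ? AND ? ORDER BY paid_on",
        (lease, start, end),
    )
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cur.description])  # header row
    writer.writerows(cur)
    return out.getvalue()


export = export_responsive(conn, "L-100", "2013-01-01", "2013-12-31")

# A quick check of a test export: every row should carry as many fields as the header.
rows = list(csv.reader(io.StringIO(export)))
assert all(len(row) == len(rows[0]) for row in rows)
print(export)
```

The same field-count check is a cheap first look at any sample export the other side supplies before a high-volume export is run.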
Before negotiating a form of production, investigate the capabilities of the DBMS. The database
administrator may not have had occasion to undertake a data export and so may have no clue
what an application can do much beyond the confines of what it does every day. It’s the rare
DBMS that can’t export delimited data. Next, have a proposed form of production in mind and, if
possible, be prepared to instruct the DBMS administrator how to secure the reporting or export
format you seek.
Remember that the resistance you experience in seeking to export to electronic formats may not
come from the opposing party or the DBMS administrator. More often, an insistence on reports
being produced as printouts or page images is driven by the needs of opposing counsel. In that
instance, it helps to establish that the export is feasible as early as possible.
As with other forms of e-discovery, be careful not to accept production in formats you don’t want
because, like it or not, many courts give just one bite at the production apple. If you accept it on
paper or as TIFF images for the sake of expediency, you often close the door on re-production
in more useful forms.
Even if the parties can agree upon an electronic form of production, it’s nevertheless a good idea
to secure a test export to evaluate before undertaking a high volume export.
Production from Databases
Few advocates obtain useful insight into database production capabilities from opposing
counsel. Often, opponents lack the grounding in DBMS needed to elicit and convey the
information, or the other side so fears your effort to “invade” their databases that they adopt an
uncooperative, at times hostile, bunker mentality.
It’s the exceedingly rare case where discovery entails one party gaining direct access to another’s
databases. Unfortunately, it’s nearly as rare for databases to be competently queried
for discoverable content and their contents supplied in utile and complete forms.
By rights, database discovery should be one of the easiest and least contentious aspects of
e-discovery. In practice, it’s a swamp.
To get database content produced in utile and complete forms requires requesting parties to
inquire into the structure of the data and the capabilities of the DBMS, particularly reporting and
export capabilities. To that end, and especially where opposing counsel can’t or won’t
cooperate, it’s prudent to depose persons knowledgeable about the databases holding
potentially responsive information.
Gathering data on databases
The following notice of deposition is an example of topics selected to elicit information needed to
frame an efficient and effective database discovery effort.
NOTICE OF DEPOSITION(S) PURSUANT TO F.R.C.P. 30(b)(6)
PLEASE TAKE NOTICE that the deposition(s) of ABC Corporation pursuant to Fed. R.
Civ. P. 30(b)(6) will take place at date/time/location. The deposition, if not completed on
the noticed date, shall be continued, if necessary, from day to day thereafter, excluding
weekends and holidays, until completed. The deposition(s) will be conducted under the
supervision of an officer who is authorized to administer an oath and will be recorded
stenographically and on video.
Pursuant to Fed. R. Civ. P. 30(b)(6), ABC Corporation must designate and produce at
the deposition(s) for examination one or more "officers, directors, or managing agents, or
other persons who consent to testify" and who possess sufficient knowledge to testify as
to the Deposition Topics listed below.
DEFINITIONS AND INSTRUCTIONS
The following definitions and instructions apply to this Notice:
1. The phrase "DATABASE OR SYSTEM" refers to devices and mechanisms to store,
access and retrieve data, including retired or "legacy" devices or systems, employed,
purchased by, leased, accessed, queried, subscribed to, summarized, controlled and/or
provided to or obtained by you, including [list systems known to be of interest], or any
other relevant databases that have not been specifically identified herein.
2. For each DATABASE or SYSTEM that holds potentially responsive information, we
seek to question the designated person(s) who, with reasonable particularity, can testify
on your behalf about information known to or reasonably available to you concerning the
Deposition Topics listed below.
3. The phrase “SUBJECT MATTER OF THE ACTION” means [e.g., the claims made in
the operative complaint in this cause or other relevant subject matter].
4. Each deponent is instructed to produce at the deposition the requested information
items listed in Exhibit A to this Notice [i.e., subpoena duces tecum].
DEPOSITION TOPICS
1. The standard reporting capabilities of the database or system, including the nature,
purpose, structure, appearance, format and electronic searchability of the information
conveyed within each standard report or reporting template that can be generated by the
database or system or by any overlay (e.g., third-party) reporting application(s);
2. The enhanced reporting capabilities of the database or system, including the nature,
purpose, structure, appearance, format and electronic searchability of the information
conveyed within each enhanced or custom report (or template) that can be generated by
the database or system or by any overlay (e.g., third-party) reporting application;
3. The flat file and structured export capabilities of each database or system, particularly
the ability to export to fielded/delimited or structured formats in a manner that faithfully
reflects the content, integrity and functionality of the source data;
4. Other export and reporting capabilities of each database or system (including any
overlay reporting application) and how they may or may not be employed to faithfully
reflect the content, integrity and functionality of the source data for use in this litigation;
5. The structure of the database or system to the extent necessary to identify data within
potentially responsive fields, records and entities, including field and table names,
definitions, constraints and relationships, as well as field codes and field code/value
translation or lookup tables.
6. The query language, syntax, capabilities and constraints of the database or system
(including any overlay reporting application) as they may bear on the ability to identify,
extract and export potentially responsive data from each database or system;
7. The user experience and interface, including datasets, functionality and options
available for use by persons involved with the subject matter of the action;
8. The operational history of the database or system to the extent that it may bear on the
content, integrity, accuracy, currency or completeness of potentially responsive data;
9. The nature, location and content of any training, user or administrator manuals or
guides that address the manner in which the database or system has been administered,
queried or its contents reviewed by persons involved with the subject matter of the action;
10. The nature, location and contents of any schema, schema documentation (such as an
entity relationship diagram or data dictionary) or the like for any database or system that
may reasonably be expected to contain information relating to the subject matter of the
action;
11. The capacity and use of any database or system to log reports or exports generated
by, or queries run against, the database or system where such reports, exports or queries
may bear on the subject matter of the action;
12. The identity and roles of current or former employees or contractors serving as a
database or system administrator for databases or systems that may reasonably be
expected to contain (or have contained) information relating to the subject matter of the
action; and
13. The cost, burden, complexity, facility and ease with which the information within
databases and systems holding potentially responsive data relating to the subject matter
of the action may be identified, preserved, searched, extracted and produced in a manner
that faithfully reflects the content, integrity and functionality of the source data.
Once, forms of production hardly mattered.
Paper was paper.
Today, forms of production can spell the difference between winning and losing.
Forms of production matter.
Forms of Production Matter
Ask parties about the forms of ESI they use daily and it’s doubtful you’ll hear a peep about TIFF
images or load files. Parties don’t use that junk; only lawyers do. When clients create,
communicate and collaborate, they do it using forms geared to native applications with file
extensions like .XLSX, .DOCX, .PPTX, .MSG, etc. They choose and use functional and
complete native and near-native forms. Those are the forms witnesses consult to reconstruct
events and refresh their memories. Those are the forms witnesses recognize at deposition and
in trial.
Yet, too often, e-discovery is a bait and switch con game: We request the parties’ modern data,
but receive the lawyers’ dilapidated junk. Once that inequity dawns on everyone, perhaps we
will bid goodbye to wasting millions on senseless downgrading of ESI and ring in a new era of
hands-on analytics.
If you are a requesting party, it’s time to take a hard look at the language of the definitions and
instructions accompanying your requests for production. If you’re like most, you didn’t draft that
language from scratch. You borrowed some boilerplate from someone who borrowed it from
someone who drafted it in 1947. That hand-me-down verbiage is long past retirement age; so,
retire it and craft modern requests for a modern digital world.
We will never be less digital than we are today. We will never return to a world where paper is
the preferred medium of information storage and transfer. Never.
We must move forms of production upstream, from depleted images and load files to functional
native and near native forms retaining the content and structure that supports migration into any
form. So, isn’t it time we demand modern evidence and obtain it in the forms in which it serves
us best? Utile forms. Complete forms. Forms that function.
Craig Ball of Austin is a Board Certified trial lawyer, certified computer forensic examiner, law
professor and electronic evidence expert. He limits his practice to serving as a court-appointed
special master and consultant in computer forensics and electronic discovery and has served as
the Special Master or testifying expert in computer forensics and electronic discovery in some of
the most challenging and celebrated cases in the U.S. For nine years, Craig penned the award-winning
Ball in Your Court column on electronic discovery for American Lawyer Media and now
writes for several national news outlets. For Craig’s articles on e-discovery and computer
forensics, please visit www.craigball.com or his blog, www.ballinyourcourt.com.
About the Author
Reprinted from Ball in Your Court, May 8, 2014
U.S. District Judge James Browning is a fine fellow. There are many reasons to say so; but the
first is that, though he sits in New Mexico, he was born in the Great State of Texas. Judge
Browning kindly spoke to my E-Discovery class at the law school in September 2012. I’d sought
him out because he’d been ably grappling with e-discovery issues in a case styled S2 v.
Micron. In his remarks to my class, he splendidly recounted some of the challenges faced by
judges who ascended to the bench before the Age of Digital Evidence. Judge Browning has one
of those C.V.s that could make any lawyer hate him (e.g., Yale, varsity letterman, Law Review
editor-in-chief, Coif, Supreme Court clerk); but he’s a good judge and a nice guy to boot.
I share my admiration of Judge Browning to underscore that I feel a bit of a rat in expressing
misgivings about his recent opinion in The Anderson Living Trust v. WPX Energy Production,
LLC, No. CIV 12-0040 JB/LFG. (D. New Mexico March 6, 2014). I think he got it wrong in some
respects – not on the peculiar equities of the case before him, but in his broader analysis of Rule
34 of the Federal Rules of Civil Procedure and in conjuring a Hobson’s choice for requesting
parties.
The Anderson Living Trust case is a fight over gas leases; but the merits don’t matter. As the
Court succinctly put it, the issue is “whether the Defendants must arrange and label
approximately 20,000 pages of documents stored in hard copy form which, at the Plaintiffs’
request, were scanned and produced as searchable PDF files….”
The Defendants planned to unilaterally convert the paper into TIFF images with OCR load files,
but acceded to Plaintiffs’ request that they supply searchable PDFs instead. So, the Defendants
converted paper records to TIFFs and then to searchable PDF images. Remember, at the start
of the litigation, the source evidence items were paper records, not electronically stored
information (ESI). Litigation alone prompted their conversion to crudely-searchable electronic
formats.
In stark contrast to ESI, paper documents are inherently unsearchable. Thus, paper records are
that rare form of evidence that is enhanced, rather than degraded, by conversion to page images
and by use of optical character recognition (OCR) to approximate searchable text. As it
happens, TIFF images cannot carry the text, but PDF images can. Think pants with pockets
versus skirts without pockets. When you use TIFF images for production, text has to go somewhere
and, since TIFFs have no “pockets,” the text goes into a purse called a “load file.”
Load files are meant to be loaded into a database called a review tool where, paired with
corresponding page images, non-searchable hard-copy documents acquire a rudimentary
searchability.
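To make the “purse” concrete, here is a minimal sketch in Python of how a review tool might read a load file and search the OCR text it carries. It assumes a Concordance-style delimited file in which each field is wrapped in þ and fields are separated by the ASCII 20 control character, a common convention but by no means the only one. The field names, Bates numbers and text are invented for illustration.

```python
import csv
import io

FIELD_SEP = "\x14"   # ASCII 20, the assumed field separator
QUOTE = "\u00fe"     # þ, the assumed text qualifier


def dat_line(*values):
    """Build one load-file line in the assumed Concordance-style layout."""
    return FIELD_SEP.join(f"{QUOTE}{v}{QUOTE}" for v in values) + "\n"


# Hypothetical load file: field names, Bates numbers and text are invented.
SAMPLE_DAT = (
    dat_line("BEGDOC", "ENDDOC", "IMAGE", "OCRTEXT")
    + dat_line("ABC000001", "ABC000002", "ABC000001.tif", "Oil and gas lease between the parties")
    + dat_line("ABC000003", "ABC000003", "ABC000003.tif", "Royalty statement for March")
    + dat_line("ABC000004", "ABC000004", "ABC000004.tif", "Amendment to 1ease terms")  # simulated OCR misread
)


def load_records(dat_text):
    """Parse the load file into one dict per document, keyed by field name."""
    reader = csv.DictReader(io.StringIO(dat_text), delimiter=FIELD_SEP, quotechar=QUOTE)
    return list(reader)


def search(records, term):
    """Return the beginning Bates number of each document whose OCR text contains the term."""
    return [r["BEGDOC"] for r in records if term.lower() in r["OCRTEXT"].lower()]


records = load_records(SAMPLE_DAT)
print(search(records, "lease"))
```

The last document concerns a lease too, but the simulated OCR misread keeps it out of the hits. That is the sense in which text derived by OCR is only crudely searchable, whatever the container.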
Because the searchable text was derived by OCR (as opposed to text extracted from an
electronic source), PDF wouldn’t outshine TIFF in terms of the accuracy of text
searchability. Both would be equally rife with errors, but both still better than
paper. Accordingly, the principal distinctions between the two image formats go to
convenience: you can search text in a PDF without messing with load files, and PDFs are more
compact than TIFFs despite holding both page images and text.
Appendix 1
Broken Badly: Anderson Living Trust v. WPX Energy Production
The Plaintiffs’ request for PDFs suggests they were seeking a form of production they could
manage without review tools; but, considering the source was paper and OCR, they gained little
by demanding PDF and, as it turned out, ceded quite a bit.
When Plaintiffs received the PDF production, they concluded they were unable to manage it
unless the Defendants either organized the documents as they’d been kept in the usual course
of business or indicated which items produced were responsive to which Request.
The defense furnished an index of their production but declined to do more, contending that,
because the paper documents were now deemed ESI, the Plaintiffs could no longer secure the
benefits of rules governing production of “documents.”
The dispute thus turned on how to apply Fed. R. Civ. P. 34(b)(2)(E), which provides:
Producing the Documents or Electronically Stored Information. Unless otherwise
stipulated or ordered by the court, these procedures apply to producing documents or
electronically stored information:
(i) A party must produce documents as they are kept in the usual course of business or
must organize and label them to correspond to the categories in the request;
(ii) If a request does not specify a form for producing electronically stored information, a
party must produce it in a form or forms in which it is ordinarily maintained or in a
reasonably usable form or forms; and
(iii) A party need not produce the same electronically stored information in more than one
form.
The Defendants made a compelling case, insisting they had: (i) produced the documents “as
they are kept in the usual course of business”; (ii) “provided information about the particular way
in which the documents are ordinarily maintained”; (iii) “provided an index identifying documents
by category”; and (iv) “produced the documents so that they are fully searchable.” The
Defendants added the coup de grâce that organizing the production to correlate with the
discovery would be a lot of effort that was unlikely to benefit anyone. They well played the
proportionality card (although the word “proportionality” appears nowhere in the opinion).
Still, the Court agreed with the Plaintiffs and directed the Defendants to label their responsive
documents to correspond to the Plaintiffs’ requests.
The ruling was a model of stare