What the Zondo Commission’s Report is teaching us about big data and eDiscovery

Nearly four years after the establishment of the Judicial Commission of Inquiry into Allegations of State Capture, Corruption and Fraud in the Public Sector (more commonly referred to as the Zondo Commission), the formal handover of the first part of the report took place on 4 January 2022.

The report is 900 pages long and, as we’ve previously mentioned in our articles, the investigation itself generated more than 1 exabyte of data as evidence. That’s one billion gigabytes.

To put that into context, let’s look at roughly how much content fits into just one gigabyte:

Text files: Nearly 678 000 pages per gigabyte.

Emails: More than 100 000 pages.

Microsoft Word files: Almost 65 000 pages.

PowerPoint Slide Decks: Roughly 17 500 slides.

Images: Close to 15 500 images.
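To get a feel for the scale, the per-gigabyte figures above can be multiplied out for a full exabyte. This is a rough sketch only: it assumes decimal units (1 exabyte = 1 000 000 000 gigabytes) and pretends the entire exabyte is made up of a single file type, which real evidence never is. The function and variable names are ours, chosen purely for illustration.

```python
# Rough scale estimate using the per-gigabyte figures quoted above.
# Assumption: decimal units, so 1 exabyte = 1_000_000_000 gigabytes.
GIGABYTES_PER_EXABYTE = 1_000_000_000

# Approximate items per gigabyte, taken from the list above.
ITEMS_PER_GIGABYTE = {
    "text pages": 678_000,
    "email pages": 100_000,
    "Word pages": 65_000,
    "PowerPoint slides": 17_500,
    "images": 15_500,
}


def items_in_exabytes(items_per_gb: int, exabytes: float = 1.0) -> int:
    """Estimate how many items fit in the given number of exabytes."""
    return round(items_per_gb * GIGABYTES_PER_EXABYTE * exabytes)


if __name__ == "__main__":
    for label, per_gb in ITEMS_PER_GIGABYTE.items():
        # Scientific notation keeps the very large counts readable.
        print(f"1 EB of {label}: about {items_in_exabytes(per_gb):.2e}")
```

Even on the most conservative figure in the list (images), a single exabyte works out to trillions of items, far beyond what any team could review by hand.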

It’s for this reason that the Commission had to rely on eDiscovery technology, mainly as a digital document review and investigative tool. This is despite the fact that eDiscovery has not yet been officially regulated in South Africa, although we expect that to change relatively soon. The South African Law Reform Commission (SALRC) recently published another discussion paper, following its first document in 2014, which agrees with the recommendation that the Rules of Court should be amended to make eDiscovery compulsory in this country.

When there are millions of electronic documents in a case (and with big data exploding, the Zondo Commission is simply an excellent example of what every legal case is experiencing), technology is the only answer.

According to a recent study by the Association of Certified E-Discovery Specialists (ACEDS), the average case contains 6.5 million pages, 10 to 15 custodians, and 130 GB of data. That’s the equivalent of 100 truckloads of data, times 100.

Where the Zondo Commission is perhaps unusual is in its size and in the fact that multiple legal teams and review teams are working alongside each other. When so many teams work together, governance becomes paramount, and once again eDiscovery technology can help teams understand exactly who is doing what, where data is captured and stored (in line with all regulations and governance requirements) and who has access to data. Above all, it helps ensure that the data remains defensible and that all required discovery is handed over.

So, given that digital document review and evidence management are so integral to this case, and that we expect local regulations to support eDiscovery soon, let’s take a look at eDiscovery basics and why technology is playing such an increasingly important role in the legal landscape.

eDiscovery in a world of big data

eDiscovery (or electronic discovery) is the identification, collection and production of electronically stored information (ESI). eDiscovery therefore directly impacts how ESI is archived, how storage systems are managed, the ability to search for relevant content, and the ability to modify content deletion policies.

If we consider the sheer volume of ESI, which includes documents, emails, databases, presentations, text messages, WhatsApp messages, voicemail, audio and video files, collaboration message boards, social media, and websites, it’s clear that the discovery and review requirements of a world of big data cannot be met without adequate processes in place and technology that can handle gigabytes of data.

The importance of eDiscovery and digital document review should also not be underestimated as it has significant implications for how organisations retain, store and manage their electronic content. For example, original content and metadata for ESI must be preserved in order to eliminate claims of spoliation or tampering with evidence later in litigation. This is extremely difficult to achieve without eDiscovery policies and processes in place, as well as partnerships with key technology providers.
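One common way to show later that a collected file has not been altered is to record a cryptographic hash of it at collection time and compare it again before production. The sketch below is illustrative only: the function names and the manifest layout are our own assumptions, not a feature of any particular eDiscovery platform, and a real workflow would also capture metadata and chain-of-custody details.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so very large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file under folder (by relative path) to its SHA-256 digest."""
    return {
        str(p.relative_to(folder)): fingerprint(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(original: dict[str, str], current: dict[str, str]) -> list[str]:
    """List files that are missing or whose digest differs from the original."""
    return sorted(
        name for name, digest in original.items()
        if current.get(name) != digest
    )
```

A manifest built when the legal hold starts can be compared with one built later; any name returned by `changed_files` points to a file that was modified or removed in between.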

Here’s what the eDiscovery process looks like in practice:

  • When litigation begins, the parties on both sides of the matter identify potentially relevant data.
  • Potentially relevant hardcopy documents and ESI are placed under a legal hold, which means they cannot be deleted or modified.
  • This data is collected and then extracted, indexed and placed into a database.
  • The sheer volume of data in any case requires the data to be analysed, separated and culled. This can be achieved with the use of Technology Assisted Review, or TAR.
  • All relevant data is indexed and hosted in a secure environment.
  • Reviewers now have access to it.
  • Privileged and non-relevant information must be redacted, so relevant documents are converted into a static format, such as TIFF or PDF, for production.
  • The ultimate goal of eDiscovery is to produce a core volume of evidence for litigation in a defensible manner.
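The culling step above can be pictured as a simple filter. The sketch below is not Technology Assisted Review, which relies on trained models; it is a much simpler stand-in that keeps only documents inside a date range that mention at least one search term. The `Document` fields, the sample data and the search terms are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class Document:
    """Hypothetical record for one collected item."""
    doc_id: str
    custodian: str
    sent: date
    text: str


def cull(documents: Iterable[Document], start: date, end: date,
         keywords: Iterable[str]) -> list[Document]:
    """Keep documents dated within [start, end] that mention any keyword."""
    lowered = [k.lower() for k in keywords]
    kept = []
    for doc in documents:
        in_range = start <= doc.sent <= end
        mentions = any(k in doc.text.lower() for k in lowered)
        if in_range and mentions:
            kept.append(doc)
    return kept


# Invented sample data, for illustration only.
SAMPLE = [
    Document("D1", "custodian-a", date(2016, 3, 1), "Invoice for the supply tender attached."),
    Document("D2", "custodian-b", date(2016, 5, 9), "Lunch on Friday?"),
    Document("D3", "custodian-a", date(2012, 1, 15), "Old tender paperwork."),
]

if __name__ == "__main__":
    for doc in cull(SAMPLE, date(2015, 1, 1), date(2017, 12, 31), ["tender", "invoice"]):
        print(doc.doc_id, doc.custodian, doc.sent)
```

In this sample only D1 survives: D2 mentions none of the terms and D3 falls outside the date range. Real culling also has to be documented so that the choices made remain defensible.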

Best practice in eDiscovery

For organisations and legal teams to benefit from digital document review (and because the complexity of data, and of multiple teams working with the same data, is driving the need for greater governance), it’s important to develop a digital document review strategy. Legal teams should understand what these strategies look like and support their clients as they develop them:

  • Create a digital document review strategy: Policies, practices, procedures and technologies are essential components of a robust review/investigation strategy. Specifically, data retention and deletion schedules should be in place.
  • Focus on employee involvement: It is essential to educate employees about the critical importance of using corporate communication and collaboration resources in accordance with policies, retaining important content, and taking care not to delete important documents. This will improve the quality of the organisation’s data and support eDiscovery and digital document review should it be required.
  • Ensure that IT and legal understand each other: eDiscovery and digital document review are both legal and technology-driven processes. This means that multiple teams will be working on the same data. It is important that everyone is aware of the policies and procedures in place, how various departments approach data, and what is important to each of them.
  • Prioritise eDiscovery and digital document review: Everyone is responsible for the data they create, store and manage, which means management should be tracking how employees handle data. This is also impacted by the Protection of Personal Information Act (POPIA) and should be front and centre for every organisation.
  • Implement deletion policies: What you preserve is as important as what you delete. Without good data deletion policies, organisations retain more information than is necessary, creating additional and unnecessary liability. Considering how much data is in an average case, this can lead to lengthy and extremely expensive review processes, not to mention data storage costs. Data classification is an important step here because decision makers must define what needs to be retained, what can safely be deleted, and the disposition method to be used.
  • Implement the right technologies: Archiving, storage, predictive coding and so on are all technology-based. With the amount of data businesses and legal teams are now contending with, this simply cannot be achieved manually. With the correct technologies in place, however, it is possible to ensure that all necessary data is accessible and reviewable early in a legal case.
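To make the deletion-policy point concrete, here is a minimal sketch of how a retention schedule, data classification and a legal hold might interact. Everything in it is an assumption for illustration: the data classes, the retention periods and the disposition methods are invented, and actual periods depend on the legislation and policies that apply to each organisation.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical schedule: data class -> (retention in years, disposition method).
RETENTION_SCHEDULE = {
    "contract": (7, "secure shred"),
    "email": (3, "delete"),
    "marketing": (1, "delete"),
}


@dataclass
class Record:
    name: str
    data_class: str
    created: date
    on_legal_hold: bool = False


def add_years(start: date, years: int) -> date:
    """Add whole years to a date, mapping 29 February to 28 February if needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:  # 29 February in a non-leap target year
        return start.replace(year=start.year + years, day=28)


def disposition(record: Record, today: date) -> str:
    """Decide what should happen to a record on the given day."""
    if record.on_legal_hold:
        # A legal hold always overrides the normal deletion schedule.
        return "retain (legal hold)"
    rule = RETENTION_SCHEDULE.get(record.data_class)
    if rule is None:
        # Unclassified data is kept until someone decides what it is.
        return "retain (unclassified, needs review)"
    years, method = rule
    return method if today >= add_years(record.created, years) else "retain"
```

The ordering is the design choice that matters: the legal hold is checked first, so nothing under hold can be deleted simply because its scheduled retention period has run out.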

Disclaimer – we have not been involved in the Zondo Commission and our comments are based on information from the public and industry in general.