From AI chatbots to social media: data collection in the digital age (2025)

April 16, 2025 - New software applications can complicate collection during discovery, on top of the challenges they pose at the identification and preservation stages of the Electronic Discovery Reference Model (EDRM). Attorneys and, importantly, the technologists they work with need to know how to defensibly collect data from these applications and understand the limitations the technologies impose, so they can accurately describe their collection efforts to the court if needed.

What follows are tips and advice for collecting from those novel data sources, from AI chatbots to social media applications, detailed in February's article.

While it may be tempting to have a client simply take a screenshot of relevant data, this method of collection does not preserve metadata and calls into question the evidence's chain of custody and authenticity. It is therefore important to understand the appropriate ways to forensically collect data sources so that chain of custody is maintained, metadata remains intact, authenticity can withstand challenge, and you can converse effectively with an e-discovery vendor.
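To make the chain-of-custody point concrete, here is a minimal Python sketch of the kind of record that can accompany a collected export file: a cryptographic hash plus basic file metadata, logged at collection time and re-checked later. It is an illustration only, not a forensic tool and not a method described by the authors; the function names and record fields are assumptions chosen for the example.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large exports need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_record(path: Path, collector: str, source: str) -> dict:
    """Build one log entry for a collected file (field names are illustrative)."""
    stat = path.stat()
    return {
        "file": path.name,
        "source": source,
        "collector": collector,
        "size_bytes": stat.st_size,
        "modified_utc": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
        "logged_utc": datetime.now(timezone.utc).isoformat(),
        "sha256": sha256_of(path),
    }


def verify(path: Path, record: dict) -> bool:
    """Re-hash the file and compare it against the digest logged at collection."""
    return sha256_of(path) == record["sha256"]
```

If the file is altered after collection, even by a single byte, verify returns False. A screenshot or a copy-and-paste offers no comparable check.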

Note that much of the advice detailed below is best carried out by an e-discovery vendor skilled at forensically collecting a wide variety of data sources, but understanding the process will help you make accurate representations to your client and the court.

Collecting data from AI chatbots, third-party messaging applications, and social media platforms involves several steps. The first is to understand the type of account to be collected: whether it is a personal or a business account, and whether it is free or paid, because collection methods can differ by account type.

For a personal account, an export requires the user's username and password, access to the individual's device, and a means of relaying any two-factor authentication codes the application requires. Most social media accounts, including Facebook, Instagram, X, Snapchat, TikTok, Mastodon, Bluesky, and Threads, will require two-factor authentication if the user has enabled that setting in the application.

Additionally, the export of personal or free versions of an application may be more limited than the exports available for a business or paid account. For example, Slack provides export capabilities of public channels for free accounts whereas Business+ accounts allow the export of both public and private channels.

For a business account, you will likely need to obtain administrative-level access to the application through the company's IT professional or the person with the most knowledge of the application in question. Corporate or business accounts can have different tiers or levels of administrative controls, depending on the subscription or plan.

For example, unlike a personal Teams account, Teams for business can be managed through a Global or Teams Administrator, and WhatsApp Business has administrators who set appropriate access levels for users. Enterprise-level products often have a friendlier user interface for bulk exporting data and may offer a choice of export formats. Lower-tier plans may have limited export functionality, or none at all.
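The account-type guidance above can be summarized as a small intake checklist. The following Python sketch restates it as a data structure and a function listing what to gather before collection begins. The class, field, and function names are hypothetical and exist only for this illustration; the checklist items paraphrase the text above and are not an exhaustive or authoritative list.

```python
from dataclasses import dataclass
from enum import Enum


class AccountType(Enum):
    PERSONAL = "personal"
    BUSINESS = "business"


@dataclass(frozen=True)
class CollectionRequest:
    application: str
    account_type: AccountType
    paid_plan: bool
    two_factor_enabled: bool = False


def prerequisites(request: CollectionRequest) -> list[str]:
    """List what to gather before collection, based on the account type."""
    if request.account_type is AccountType.PERSONAL:
        items = ["username and password", "access to the user's device"]
        if request.two_factor_enabled:
            items.append("a way to relay two-factor authentication codes")
    else:
        items = ["administrative-level access via IT or the most knowledgeable person"]
    if not request.paid_plan:
        # Free or lower-tier plans may export less data, or none at all.
        items.append("confirm export scope for the free or lower-tier plan")
    return items


if __name__ == "__main__":
    request = CollectionRequest("Slack", AccountType.BUSINESS, paid_plan=False)
    for item in prerequisites(request):
        print("-", item)
```

Recording these answers at intake gives the e-discovery vendor what it needs to choose a collection method, and gives counsel a basis for describing that method to the court.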

As a second step, it is helpful to understand the reporting and exporting functions of these applications in general. Most applications, such as ChatGPT, Copilot, and Gemini, allow you to filter data of interest or run some type of export or report. However, there are products that have limited or no capabilities for exporting data.

For example, Perplexity AI and Jasper AI do not offer a built-in feature for exporting chat data. Perplexity does allow you to export research reports and Perplexity Pages to PDF or as a link, but it cannot directly export chats. Jasper likewise lacks a chat export feature. Message boards have suggested copying and pasting the desired content into another location, such as Notepad, TextPad, or Word, but these methods raise the same chain of custody and authenticity issues as a screenshot.

There are third-party solutions, such as "Save My Chatbot," which can export chatbot threads into formatted Markdown files; however, these solutions have not been vetted and are thus not considered forensically sound methods of data collection.

Other products have such strong encryption or protections on data that it is either very difficult or impossible to get data exported from them. One specific example is Signal. While these applications are the exception and not the rule, they do exist, and you must confer with an e-discovery professional to discuss what is and is not possible for collection.
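As a rough illustration of this second step, the sketch below sorts the applications named above into the three situations the text describes and suggests a next step for each. The table restates only the examples given in this article as of its writing; product capabilities change, so every entry is an assumption to verify with an e-discovery professional. The names ExportSupport, ARTICLE_EXAMPLES, and triage are hypothetical.

```python
from enum import Enum


class ExportSupport(Enum):
    BUILT_IN = "built-in export or report"
    NO_CHAT_EXPORT = "no built-in chat export"
    RESTRICTED = "encryption or protections make export difficult or impossible"
    UNKNOWN = "not assessed"


# Entries restate the article's examples only; verify against current products.
ARTICLE_EXAMPLES = {
    "chatgpt": ExportSupport.BUILT_IN,
    "copilot": ExportSupport.BUILT_IN,
    "gemini": ExportSupport.BUILT_IN,
    "perplexity ai": ExportSupport.NO_CHAT_EXPORT,
    "jasper ai": ExportSupport.NO_CHAT_EXPORT,
    "signal": ExportSupport.RESTRICTED,
}


def triage(application: str) -> str:
    """Suggest a next step for planning collection from the named application."""
    key = application.strip().lower()
    support = ARTICLE_EXAMPLES.get(key, ExportSupport.UNKNOWN)
    if support is ExportSupport.BUILT_IN:
        return "use the application's export or report function"
    if support is ExportSupport.NO_CHAT_EXPORT:
        return "consult an e-discovery vendor; copy-paste and unvetted tools are not forensically sound"
    if support is ExportSupport.RESTRICTED:
        return "confer with an e-discovery professional on what is possible"
    return "confirm export capabilities before planning collection"
```

Anything not in the table falls through to the final branch, which reflects the article's general advice: confirm what an application can export before committing to a collection plan.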

It is important for legal professionals to understand the nuances of data collection from various digital platforms, including AI chatbots and social media applications. While forensic collections ensure the integrity and authenticity of collected data, new applications can pose challenges for even the most seasoned e-discovery vendor. Regardless, by understanding the processes for collections, following best practices and collaborating with e-discovery professionals, attorneys can effectively navigate the complexities of modern data collection.

Nicole Gill is a regular contributing columnist on e-discovery for Reuters Legal News and Westlaw Today.

Opinions expressed are those of the author. They do not reflect the views of Reuters News, which, under the Trust Principles, is committed to integrity, independence, and freedom from bias. Westlaw Today is owned by Thomson Reuters and operates independently of Reuters News.

Nicole Gill

Nicole Gill, chair and managing member of CODISCOVR, an e-discovery and information governance practice within Cozen O’Connor, develops tailored discovery management strategies and oversees complex, high-profile e-discovery projects using advanced technologies. She negotiates agreements on ESI preservation, directs AI use in discovery, and advises on data management and compliance with privacy laws across various jurisdictions. She is located in Philadelphia and can be reached at ngill@cozen.com.

Stephen Johnson

Stephen Johnson, assistant director of CODISCOVR, an e-discovery and information governance practice within Cozen O’Connor, routinely provides technological and organizational expertise in electronic discovery matters across all practice groups, and has specialties in customer service, optimizing operational processes and implementing innovative solutions. He can be reached at stephenjohnson@cozen.com.
