Generative AI in Practice: Copyright and Data Protection Considerations

Emily Maass and Leigh Gill, Immix Law

In December 2024, Apple released iOS 18.2, a new version of its mobile operating system that includes Apple Intelligence and offers a ChatGPT extension. While this may sound like a tech story rather than a legal one, just below the humdrum march of technology are important legal considerations. Though artificial intelligence (AI) has been around for generations in an academic context, it has only recently become mainstream, raising important considerations for business attorneys. Before you click “update,” it is important to consider both the copyright and data protection implications of incorporating this technology into your practice.

Copyright and AI

In the two years since the release of ChatGPT, technology lawyers have seen the availability and adoption of AI tools outpace the developing law in the field of copyright. There are two key issues:

  • What is the propriety of using someone’s copyrighted work to train an AI?
  • Who owns the output of the AI tool?

These issues are intertwined, and there are currently more questions than answers. Litigation on the first issue is extensive and ongoing, and in most cases the defendants (developers of AI tools) are asserting a fair use defense. Mark Lemley, a tech law luminary, has opined that permitting copying of works for a non-expressive purpose such as training an AI is consistent with copyright law’s objectives. If successful, a fair use defense would help narrow the universe of possible answers to the second question, but it would not resolve it.

The generative AI on the market is powered by machine learning algorithms, which means that the output is dependent on patterns found within large databases of information. For example, chatbots and spelling suggestions on your phone produce each word in a sentence by predicting it from the sequence of words preceding it, matched against a database of similar content. AI databases are typically black boxes, and there’s no clarity as to which copyrighted works may be in the database. Extensive litigation is ongoing: authors and publishers assert their copyrighted works are infringed by inclusion in the database. Tech companies respond that databases are transformative, the output doesn’t match the input, and any use of copyrighted works is fair use and non-infringing.
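
To make the word-by-word prediction described above concrete, here is a minimal sketch in Python. It is a toy illustration under stated assumptions, not a description of how any commercial AI tool is built: the three-sentence corpus and the names `build_bigram_counts` and `predict_next` are hypothetical and chosen only for this example. The sketch counts which word follows which in a small body of text, then predicts the most frequent follower.

```python
from collections import Counter, defaultdict

# Hypothetical miniature "database" of text, invented for this example.
# Real tools are trained on vastly larger collections.
CORPUS = [
    "the court granted the motion",
    "the court denied the appeal",
    "the court granted the request",
]


def build_bigram_counts(sentences):
    """Count, for each word, how often each other word immediately follows it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the word most often seen after `word`, or None if `word` never appeared."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


counts = build_bigram_counts(CORPUS)
# In this corpus "granted" follows "court" twice and "denied" once.
print(predict_next(counts, "court"))
```

Even in this toy, the prediction depends entirely on the text that went in, which suggests why the contents of the underlying database matter so much to both sides of the litigation.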

For consumers (including lawyers) who choose to use AI tools to generate new content, there is a somewhat separate question of ownership in the resulting work. If the content owners are successful in proving infringement, they could also assert that output of the tools is a derivative work in which they have rights. If the fair use defense is successful, the technology companies may claim ownership in the output. (Read your terms of use—commonly used AI tools typically do not claim ownership from users. Microsoft’s tools claim only limited use of customer data and allow users to own the output of its Copilot product, even going so far as offering to defend copyright claims arising from use of Copilot.) Only time and extensive litigation will determine whether fair use applies.

Agencies responsible for the administration of intellectual property laws have been quicker than courts to provide guidance, but there remains significant uncertainty. The U.S. Copyright Office has stated that it will issue a copyright registration to a human author who submits a work that was generated with AI tools only if the human (and not the AI) selected, arranged, and otherwise created the expression. The Copyright Office has refused registration for works that were machine-created, regardless of how many programming decisions were involved in directing that machine to produce the output.

Guidance from the Copyright Office distinguishes between “assistive uses” of AI systems and “prompt engineering” on page eighteen of its copyrightability report: “The Office concludes that…prompts alone do not provide sufficient human control to make users of an AI system the authors of the output….While highly detailed prompts could contain the user’s desired expressive elements, at present they do not control how the AI system processes them in generating the output.” Quite apart from the ethical issues of using this developing technology in practice, under this guidance, if a lawyer uses a machine to produce work product, there are no rights of authorship in that work product.

Legal AI and Data Protection

Attorneys are not immune from the pressure to incorporate AI into the tools of our trade. A quick online search lists dozens of tools claiming to leverage AI to make your practice faster, better, smarter, and more profitable than opposing counsel. Attorneys are expected to maintain competence with technology in their legal practice, and a firm’s comfort with adopting new technology can be determinative of its capacity for growth and longevity in an increasingly challenging legal market.

While this drive to innovate is nothing new for the legal profession, neither is the persistent nagging concern of how innovation may clash with our age-old promise to preserve client confidential information. Not all AI is created equal, and advertising a technology tool as “AI for lawyers” does not guarantee that the developers offer a product that can stand up to an attorney’s obligations to their clients. When considering adoption of a given AI tool, the question of whether it is designed to support a lawyer’s confidentiality obligations should be top of mind.

AI tools are frequently black boxes with respect to data provenance and disposition. This places a heavy due diligence burden on the law practice to thoroughly understand how the AI tool was trained and how the AI tool will use and protect the practice’s data once it is entrusted to the AI tool. As a starting point, consider these questions when performing due diligence on a potential new AI tool for your practice:

  • What data is used as training data for the AI tool? Can the vendor confirm that it was lawfully obtained and can be used by you for any purpose without infringing on the rights of third parties?
  • Does the vendor grant itself a broad license to use your data or disclose it to third parties? Check the terms and conditions, which are typically not up for negotiation.
  • Will your data be segregated on the AI tool’s systems, or combined with other users’ data?
  • Will the data you put into the AI tool (e.g., details about your practice, cases, work product) be used as training data?
  • Is it possible that your data (or your client’s data) could appear in another user’s output?

Frustratingly, it’s not uncommon for these questions to be met with somewhat vague responses that raise even more questions. If you choose to incorporate AI into your practice, there are some steps you can take to help safeguard your data and your clients’ confidentiality:

  • Adjust your software settings to prevent the AI tool from running constantly in the background or otherwise automatically collecting data from your email, phone calls, or other device applications where you input, process, or store sensitive or confidential information.
  • Turn off the AI tool’s “wake word” or any other setting that lets the AI tool guess when it should start recording, to prevent any unintended collection of data. Configure your privacy settings so that you are required to turn on the AI tool directly. (See Lopez et al. v. Apple Inc., Case No. 5:19-cv-04577.)
  • Avoid inputting confidential information into an AI tool unless the vendor can provide you with legally binding assurance that the AI tool is expressly designed for the practice of law and safeguarding sensitive data.
  • Always notify your clients and obtain their consent before using an AI assistant in meetings or conversations, recording a phone call, or using other AI tools when working with their confidential information.
  • Regularly check your software and devices for recordings or other data storage from AI tools to confirm that the AI tool is only collecting data when directly prompted by you.
  • Configure your settings to automatically delete your data at regular intervals (e.g., every thirty days). Confirm that the deletion is permanent and that your data is not being stored elsewhere on the vendor’s systems.
  • Beware of relying too heavily on AI-generated outputs. Outputs containing factual statements, quotations, or citations might be the result of an AI hallucination. Also, remember that AI outputs are only as good as the training data used to develop the tool and the clarity of your prompt.

Lawyers must be aware of newly developing technology, and they have a duty of competence in the tools they use. AI is everywhere, and it is bound to become a key underlying technology in the practice of law. In these days of early adoption, attorneys must take care when selecting AI-driven technology solutions, with a focus on client confidentiality and quality work product. AI tools can be a valuable resource, but they are not a substitute for good, careful lawyering. ♦