Unlike the U.S. and E.U., Japan has decided not to enforce copyright law on material used to train generative AI (gen AI) models. This bold approach may change; however, this is the Japanese government’s current stance as Japan plays catch-up in the gen AI arms race.
Japan’s gen AI policy permits AI models to process any data “regardless of whether it is for non-profit or commercial purposes, whether it is an act other than reproduction, or whether it is content obtained from illegal sites or otherwise.”
In an April 2023 blog post, Keiko Nagaoka, the Japanese Minister of Education, Culture, Sports, Science, and Technology, confirmed the nation’s policy. According to lawmaker Takashi Kii, Nagaoka is on record as saying, “It is possible to use the work for information analysis—regardless of the method, regardless of the content.”
Hence, as long as companies use copyrighted content solely to train gen AI models, this behavior appears to be safe from regulatory action. That said, LLM training involves several different elements, each of which is worth discussing in turn.
Japan’s take on LLM training, usage, and copyright issues
Firstly, we should differentiate between “model training” and “model usage.” Model training involves the creation, training, and fine-tuning of a large language model. Model usage, by contrast, involves user prompts (e.g., voice, text, or image inputs), as well as the AI output based on those user prompts.
In 2018, the Copyright Act of Japan (1970) was amended to create accommodating provisions for AI training. As lawyers from Nishimura & Asahi explain, this 2018 amendment made [Japanese copyright law] “one of the more ‘relaxed’ copyright acts in the world under which business would likely be allowed to use copyrighted training data for AI development unless exceptional requirements are met by those claiming infringement.”