Artificial Intelligence Paper (English Version): Hey, That’s My Data! Label-Only Dataset Inference in Large Language Models

  • About 115,500 characters
  • About 15 pages
  • Published 2025-06-13 (Hunan)

Hey, That’s My Data! Label-Only Dataset Inference in Large Language Models

Chen Xiong^1, Zihao Wang^2, Rui Zhu^3, Tsung-Yi Ho^1, Pin-Yu Chen^4, Jingwei Xiong^5, Haixu Tang^2, Lucila Ohno-Machado^3

^1 The Chinese University of Hong Kong, ^2 Indiana University Bloomington, ^3 Yale University, School of Medicine, ^4 IBM Research AI, ^5 University of California, Davis

Abstract—Large Language Models (LLMs) have revolutionized Natural Language Processing by excelling at interpreting, reasoning about, and generating human language. However, their reliance on large-scale, often proprietary datasets poses a critical challenge: unauthorized usage of such data can lead to copyright infringement and significant financial harm. Existing dataset-inference methods typically depend on log probabilities to detect suspicious training material, yet many leading LLMs have begun wit

Nonetheless, this surge in development has also sparked debates regarding unauthorized usage of data. In particular, copyrighted content may be embedded in model training sets without proper consent, infringing on the rights of authors and potentially leading to financial harm [23], [34], [44]. A notable example is the New York Times filing a lawsuit against OpenAI and Microsoft over the alleged improper use of their copyrighted materials for training
