|
目录一、用法精讲16、pandas.DataFrame.to_json函数16-1、语法16-2、参数16-3、功能16-4、返回值16-5、说明16-6、用法16-6-1、数据准备16-6-2、代码示例16-6-3、结果输出17、pandas.read_html函数17-1、语法17-2、参数17-3、功能17-4、返回值17-5、说明17-6、用法17-6-1、数据准备17-6-2、代码示例17-6-3、结果输出 18、pandas.DataFrame.to_html函数18-1、语法18-2、参数18-3、功能18-4、返回值18-5、说明18-6、用法18-6-1、数据准备18-6-2、代码示例18-6-3、结果输出 二、推荐阅读1、Python筑基之旅2、Python函数之旅3、Python算法之旅4、Python魔法之旅5、博客个人主页一、用法精讲16、pandas.DataFrame.to_json函数16-1、语法#16、pandas.DataFrame.to_json函数DataFrame.to_json(path_or_buf=None,*,orient=None,date_format=None,double_precision=10,force_ascii=True,date_unit='ms',default_handler=None,lines=False,compression='infer',index=None,indent=None,storage_options=None,mode='w')ConverttheobjecttoaJSONstring.NoteNaN’sandNonewillbeconvertedtonullanddatetimeobjectswillbeconvertedtoUNIXtimestamps.Parameters:path_or_bufstr,pathobject,file-likeobject,orNone,defaultNoneString,pathobject(implementingos.PathLike[str]),orfile-likeobjectimplementingawrite()function.IfNone,theresultisreturnedasastring.orientstrIndicationofexpectedJSONstringformat.Series:defaultis‘index’allowedvaluesare:{‘split’,‘records’,‘index’,‘table’}.DataFrame:defaultis‘columns’allowedvaluesare:{‘split’,‘records’,‘index’,‘columns’,‘values’,‘table’}.TheformatoftheJSONstring:‘split’:dictlike{‘index’->[index],‘columns’->[columns],‘data’->[values]}‘records’:listlike[{column->value},…,{column->value}]‘index’:dictlike{index->{column->value}}‘columns’:dictlike{column->{index->value}}‘values’:justthevaluesarray‘table’:dictlike{‘schema’:{schema},‘data’:{data}}Describingthedata,wheredatacomponentislikeorient='records'.date_format{None,‘epoch’,‘iso’}Typeofdateconversion.‘epoch’=epochmilliseconds,‘iso’=ISO8601.Thedefaultdependsontheorient.Fororient='table',thedefaultis‘iso’.Forallotherorients,thedefaultis‘epoch’.double_precisionint,default10Thenumberofdecimalplacestousewhenencodingfloatingpointvalues.Thepossiblemaximalvalueis15.Passingdouble_precisiongreaterthan15willraiseaValueError.force_asciibool,defaultTrueForceencodedstringtobeASCII.date_unitstr,default‘ms’(milliseconds)Thetimeunittoencodeto,governstimestampandISO8601precision.Oneof‘s’,‘ms’,‘us’,‘ns’forsecond,millisecond,microsecond,andnanosecondrespectively.default_handlercallable,defaultNoneHandlertocallifobjectcannototherwisebeconvertedtoasuitableformatforJSON.Shouldreceiveasingleargumentwhichistheobjecttoconvertandreturnaserialisableobject.linesbool,defaultFalseIf‘orient’is‘records’writeoutline-delimitedjsonformat.WillthrowValueErrorifincorrect‘orient’sinceothersarenotlist-like.compressionstrordict,default‘infer’Foron-the-flycompressionoftheoutputdata.If‘infer’and‘path_or_buf’ispath-like,thendetectcompressionfromthefollowingextensions:‘.gz’,‘.bz2’,‘.zip’,‘.xz’,‘.zst’,‘.tar’,‘.tar.gz’,‘.tar.xz’or‘.tar.bz2’(otherwisenocompression).SettoNonefornocompression.Canalsobeadictwithkey'method'settooneof{'zip','gzip','bz2','zstd','xz','tar'}andotherkey-valuepairsareforwardedtozipfile.ZipFile,gzip.GzipFile,bz2.BZ2File,zstandard.ZstdCompressor,lzma.LZMAFileortarfile.TarFile,respectively.Asanexample,thefollowingcouldbepassedforfastercompressionandtocreateareproduciblegziparchive:compression={'method':'gzip','compresslevel':1,'mtime':1}.Newinversion1.5.0:Addedsupportfor.tarfiles.Changedinversion1.4.0:Zstandardsupport.indexboolorNone,defaultNoneTheindexisonlyusedwhen‘orient’is‘split’,‘index’,‘column’,or‘table’.Ofthese,‘index’and‘column’donotsupportindex=False.indentint,optionalLengthofwhitespaceusedtoindenteachrecord.storage_optionsdict,optionalExtraoptionsthatmakesenseforaparticularstorageconnection,e.g.host,port,username,password,etc.ForHTTP(S)URLsthekey-valuepairsareforwardedtourllib.request.Requestasheaderoptions.ForotherURLs(e.g.startingwith“s3://”,and“gcs://”)thekey-valuepairsareforwardedtofsspec.open.Pleaseseefsspecandurllibformoredetails,andformoreexamplesonstorageoptionsreferhere.modestr,default‘w’(writing)SpecifytheIOmodeforoutputwhensupplyingapath_or_buf.Acceptedargsare‘w’(writing)and‘a’(append)only.mode=’a’isonlysupportedwhenlinesisTrueandorientis‘records’.Returns:NoneorstrIfpath_or_bufisNone,returnstheresultingjsonformatasastring.OtherwisereturnsNone.16-2、参数16-2-1、path_or_buf(可选,默认值为None):字符串或文件对象,如果为字符串,表示JSON数据将被写入该路径的文件中;如果为文件对象,则数据将被写入该文件对象;如果为None,则返回生成的JSON格式的字符串。16-2-2、orient(可选,默认值为None):字符串,用于指示JSON文件中数据的期望格式。16-2-2-1、'split':字典像{index->[index],columns->[columns],data->[values]}。16-2-2-2、'records':列表像[{column->value},...,{column->value}]。16-2-2-3、'index':字典像index->{column->value}},其中索引是JSON对象中的键。16-2-2-4、'columns':字典像{{column->index}->value}。16-2-2-5、'values':仅仅是值数组。16-2-2-6、如果没有指定,Pandas会尝试自动推断。16-2-3、date_format(可选,默认值为None):字符串,用于日期时间对象的格式。默认为 None,意味着使用ISO8601格式。16-2-4、double_precision(可选,默认值为10):整数,指定浮点数的精度(小数点后的位数)。16-2-5、force_ascii(可选,默认值为True):布尔值,是否确保所有非ASCII字符都被转义。16-2-6、date_unit(可选,默认值为'ms'):字符串,用于时间戳的时间单位,'s','ms','us','ns'分别代表秒、毫秒、微秒、纳秒。16-2-7、default_handler(可选,默认值为None):可调用对象,用于处理无法转换为JSON的对象。默认为None,此时会抛出TypeError。16-2-8、lines(可选,默认值为False):布尔值,如果为True,则输出将是每行一个记录的JSON字符串的列表。16-2-9、compression(可选,默认值为'infer'):字符串或None,指定用于写入文件的压缩方式。'infer'(默认)会根据文件扩展名自动选择压缩方式(如.gz)。16-2-10、index(可选,默认值为None):布尔值或字符串列表,是否将索引作为JSON的一部分输出。如果为False,则不输出索引;如果为True,则输出所有索引;如果为字符串列表,则只输出指定的索引。16-2-11、indent(可选,默认值为None):整数或None,指定输出JSON字符串的缩进量。如果为None,则不进行缩进。16-2-12、storage_options(可选,默认值为None):字典,用于文件存储的额外选项,如AWSS3访问密钥。16-2-13、mode(可选,默认值为'w'):字符串,'w'表示写入模式(如果文件存在则覆盖),'a'表示追加模式。16-3、功能 将PandasDataFrame对象转换为JSON格式的数据,并可以选择性地将其写入文件或作为字符串返回。16-4、返回值16-4-1、如果path_or_buf参数为None(默认值),则函数返回一个包含JSON数据的字符串。16-4-2、如果path_or_buf参数被指定为一个文件路径或文件对象,则函数不返回任何值(即返回None),而是将JSON数据写入指定的文件或文件对象。16-5、说明 该函数在数据分析和数据科学中非常有用,特别是当你需要将DataFrame的内容导出到前端应用程序、Web服务或进行跨语言的数据交换时。16-6、用法16-6-1、数据准备无16-6-2、代码示例#16、pandas.DataFrame.to_json函数#16-1、直接输出importpandasaspddf=pd.DataFrame({'A':[1,2,3],'B':[4,5,6]})json_str=df.to_json(orient='records')print(json_str)#输出:[{"A":1,"B":4},{"A":2,"B":5},{"A":3,"B":6}]#16-2、写入文件importpandasaspddf=pd.DataFrame({'A':[1,2,3],'B':[4,5,6]})df.to_json('data.json',orient='records',lines=True)#在Python脚本所在目录自动生成data.json文件,文件中包含了JSON数据16-6-3、结果输出#16、pandas.DataFrame.to_json函数#16-1、直接输出#[{"A":1,"B":4},{"A":2,"B":5},{"A":3,"B":6}]#16-2、写入文件#在Python脚本所在目录自动生成data.json文件,文件中包含了JSON数据17、pandas.read_html函数17-1、语法#17、pandas.read_html函数pandas.read_html(io,*,match='.+',flavor=None,header=None,index_col=None,skiprows=None,attrs=None,parse_dates=False,thousands=',',encoding=None,decimal='.',converters=None,na_values=None,keep_default_na=True,displayed_only=True,extract_links=None,dtype_backend=_NoDefault.no_default,storage_options=None)ReadHTMLtablesintoalistofDataFrameobjects.Parameters:iostr,pathobject,orfile-likeobjectString,pathobject(implementingos.PathLike[str]),orfile-likeobjectimplementingastringread()function.ThestringcanrepresentaURLortheHTMLitself.Notethatlxmlonlyacceptsthehttp,ftpandfileurlprotocols.IfyouhaveaURLthatstartswith'https'youmighttryremovingthe's'.Deprecatedsinceversion2.1.0assinghtmlliteralstringsisdeprecated.Wrapliteralstring/bytesinputinio.StringIO/io.BytesIOinstead.matchstrorcompiledregularexpression,optionalThesetoftablescontainingtextmatchingthisregexorstringwillbereturned.UnlesstheHTMLisextremelysimpleyouwillprobablyneedtopassanon-emptystringhere.Defaultsto‘.+’(matchanynon-emptystring).Thedefaultvaluewillreturnalltablescontainedonapage.ThisvalueisconvertedtoaregularexpressionsothatthereisconsistentbehaviorbetweenBeautifulSoupandlxml.flavor{“lxml”,“html5lib”,“bs4”}orlist-like,optionalTheparsingengine(orlistofparsingengines)touse.‘bs4’and‘html5lib’aresynonymouswitheachother,theyareboththereforbackwardscompatibility.ThedefaultofNonetriestouselxmltoparseandifthatfailsitfallsbackonbs4+html5lib.headerintorlist-like,optionalTherow(orlistofrowsforaMultiIndex)tousetomakethecolumnsheaders.index_colintorlist-like,optionalThecolumn(orlistofcolumns)tousetocreatetheindex.skiprowsint,list-likeorslice,optionalNumberofrowstoskipafterparsingthecolumninteger.0-based.Ifasequenceofintegersorasliceisgiven,willskiptherowsindexedbythatsequence.Notethatasingleelementsequencemeans‘skipthenthrow’whereasanintegermeans‘skipnrows’.attrsdict,optionalThisisadictionaryofattributesthatyoucanpasstousetoidentifythetableintheHTML.ThesearenotcheckedforvaliditybeforebeingpassedtolxmlorBeautifulSoup.However,theseattributesmustbevalidHTMLtableattributestoworkcorrectly.Forexample,attrs={'id':'table'}isavalidattributedictionarybecausethe‘id’HTMLtagattributeisavalidHTMLattributeforanyHTMLtagasperthisdocument.attrs={'asdf':'table'}isnotavalidattributedictionarybecause‘asdf’isnotavalidHTMLattributeevenifitisavalidXMLattribute.ValidHTML4.01tableattributescanbefoundhere.AworkingdraftoftheHTML5speccanbefoundhere.Itcontainsthelatestinformationontableattributesforthemodernweb.parse_datesbool,optionalSeeread_csv()formoredetails.thousandsstr,optionalSeparatortousetoparsethousands.Defaultsto','.encodingstr,optionalTheencodingusedtodecodethewebpage.DefaultstoNone.``None``preservesthepreviousencodingbehavior,whichdependsontheunderlyingparserlibrary(e.g.,theparserlibrarywilltrytousetheencodingprovidedbythedocument).decimalstr,default‘.’Charactertorecognizeasdecimalpoint(e.g.use‘,’forEuropeandata).convertersdict,defaultNoneDictoffunctionsforconvertingvaluesincertaincolumns.Keyscaneitherbeintegersorcolumnlabels,valuesarefunctionsthattakeoneinputargument,thecell(notcolumn)content,andreturnthetransformedcontent.na_valuesiterable,defaultNoneCustomNAvalues.keep_default_nabool,defaultTrueIfna_valuesarespecifiedandkeep_default_naisFalsethedefaultNaNvaluesareoverridden,otherwisethey’reappendedto.displayed_onlybool,defaultTrueWhetherelementswith“display:none”shouldbeparsed.extract_links{None,“all”,“header”,“body”,“footer”}Tableelementsinthespecifiedsection(s)with
|
|