Python Cool Libraries Tour: Third-Party Library Pandas (093)

见贤思齐 · Posted on 2024-9-10 03:41:31

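Contents

I. Usage in detail
    396. pandas.Series.to_frame method
    397. pandas.Series.to_xarray method
    398. pandas.Series.to_hdf method
    399. pandas.Series.to_sql method
    400. pandas.Series.to_json method
    (each entry is split into: 1. Syntax, 2. Parameters, 3. Function, 4. Return value, 5. Notes, 6. Usage, with 6-1. Data preparation, 6-2. Code example, 6-3. Output)
II. Recommended reading
    1. Python Foundations Tour
    2. Python Functions Tour
    3. Python Algorithms Tour
    4. Python Magic Tour
    5. Blog home page

I. Usage in detail

396. pandas.Series.to_frame method

396-1. Syntax

    # 396. pandas.Series.to_frame method
    pandas.Series.to_frame(name=_NoDefault.no_default)

    Convert Series to DataFrame.

    Parameters:
    name : object, optional
        The passed name should substitute for the series name (if it has one).

    Returns:
    DataFrame
        DataFrame representation of Series.

396-2. Parameters

396-2-1. name (optional): The column name of the resulting DataFrame. If name is not passed, pandas uses the Series' own name as the column name; if the Series has no name either, the column is labelled 0 (see example 396-3 below).

396-3. Function

Converts a pandas Series into a single-column DataFrame.

396-4. Return value

Returns a pandas DataFrame with one column whose values come from the original Series.

396-5. Notes

None

396-6. Usage

396-6-1. Data preparation

None

396-6-2. Code example

    # 396. pandas.Series.to_frame method
    # 396-1. Use the default column name (the Series has its own name)
    import pandas as pd
    # Create a named Series
    s = pd.Series([1, 2, 3], name='my_series')
    # Convert the Series to a DataFrame
    df = s.to_frame()
    print(df)

    # 396-2. Specify the column name
    import pandas as pd
    # Create an unnamed Series
    s = pd.Series([1, 2, 3])
    # Convert the Series to a DataFrame and specify the column name
    df = s.to_frame(name='new_column')
    print(df)

    # 396-3. The Series has no name and no column name is specified
    import pandas as pd
    # Create an unnamed Series
    s = pd.Series([1, 2, 3])
    # Convert the Series to a DataFrame without specifying a column name
    df = s.to_frame()
    print(df)

396-6-3. Output

    # 396. pandas.Series.to_frame method
    # 396-1. Use the default column name (the Series has its own name)
    #    my_series
    # 0          1
    # 1          2
    # 2          3
    # 396-2. Specify the column name
    #    new_column
    # 0           1
    # 1           2
    # 2           3
    # 396-3. The Series has no name and no column name is specified
    #    0
    # 0  1
    # 1  2
    # 2  3

397. pandas.Series.to_xarray method

397-1. Syntax

    # 397. pandas.Series.to_xarray method
    pandas.Series.to_xarray()

    Return an xarray object from the pandas object.

    Returns:
    xarray.DataArray or xarray.Dataset
        Data in the pandas structure converted to Dataset if the object is a DataFrame, or a DataArray if the object is a Series.

397-2. Parameters

None

397-3. Function

Converts a pandas Series object into an xarray DataArray object.

397-4. Return value

Returns an xarray.DataArray object that contains the data of the original Series together with its index. The xarray package must be installed for this method to work.

397-5. Notes

None

397-6. Usage

397-6-1. Data preparation

None

397-6-2. Code example

    # 397. pandas.Series.to_xarray method
    # 397-1. Basic usage
    import pandas as pd
    # Import the xarray library
    import xarray as xr
    # Create a pandas Series object
    s = pd.Series([10, 20, 30], index=['a', 'b', 'c'], name='example_series')
    # Convert the Series to a DataArray
    data_array = s.to_xarray()
    print(data_array, end='\n\n')

    # 397-2. Worked example
    import pandas as pd
    # We need a Series object; assume it holds some time-series data
    dates = pd.date_range('20240101', periods=4)
    data = pd.Series([1.1, 2.3, 3.6, 4.0], index=dates, name='data_series')
    # Convert the Series to a DataArray
    data_array = data.to_xarray()
    print(data_array)

397-6-3. Output

    # 397. pandas.Series.to_xarray method
    # 397-1. Basic usage
    # Size: 24B
    # array([10, 20, 30], dtype=int64)
    # Coordinates:
    #   * index    (index) object 24B 'a' 'b' 'c'
    # 397-2. Worked example
    # Size: 32B
    # array([1.1, 2.3, 3.6, 4. ])
    # Coordinates:
    #   * index    (index) datetime64[ns] 32B 2024-01-01 2024-01-02 ... 2024-01-04

398. pandas.Series.to_hdf method

398-1. Syntax

    # 398. pandas.Series.to_hdf method
    pandas.Series.to_hdf(path_or_buf, *, key, mode='a', complevel=None, complib=None, append=False, format=None, index=True, min_itemsize=None, nan_rep=None, dropna=None, data_columns=None, errors='strict', encoding='UTF-8')

    Write the contained data to an HDF5 file using HDFStore.

    Hierarchical Data Format (HDF) is self-describing, allowing an application to interpret the structure and contents of a file with no outside information. One HDF file can hold a mix of related objects which can be accessed as a group or as individual objects.

    In order to add another DataFrame or Series to an existing HDF file please use append mode and a different key.

    Warning: One can store a subclass of DataFrame or Series to HDF5, but the type of the subclass is lost upon storing.

    For more information see the user guide.

    Parameters:
    path_or_buf : str or pandas.HDFStore
        File path or HDFStore object.
    key : str
        Identifier for the group in the store.
    mode : {'a', 'w', 'r+'}, default 'a'
        Mode to open file:
        'w': write, a new file is created (an existing file with the same name would be deleted).
        'a': append, an existing file is opened for reading and writing, and if the file does not exist it is created.
        'r+': similar to 'a', but the file must already exist.
    complevel : {0-9}, default None
        Specifies a compression level for data. A value of 0 or None disables compression.
    complib : {'zlib', 'lzo', 'bzip2', 'blosc'}, default 'zlib'
        Specifies the compression library to be used. These additional compressors for Blosc are supported (default if no compressor specified: 'blosc:blosclz'): {'blosc:blosclz', 'blosc:lz4', 'blosc:lz4hc', 'blosc:snappy', 'blosc:zlib', 'blosc:zstd'}. Specifying a compression library which is not available issues a ValueError.
    append : bool, default False
        For Table formats, append the input data to the existing.
    format : {'fixed', 'table', None}, default 'fixed'
        Possible values:
        'fixed': Fixed format. Fast writing/reading. Not-appendable, nor searchable.
        'table': Table format. Write as a PyTables Table structure which may perform worse but allow more flexible operations like searching / selecting subsets of the data.
        If None, pd.get_option('io.hdf.default_format') is checked, followed by fallback to "fixed".
    index : bool, default True
        Write DataFrame index as a column.
    min_itemsize : dict or int, optional
        Map column names to minimum string sizes for columns.
    nan_rep : Any, optional
        How to represent null values as str. Not allowed with append=True.
    dropna : bool, default False, optional
        Remove missing values.
    data_columns : list of columns or True, optional
        List of columns to create as indexed data columns for on-disk queries, or True to use all columns. By default only the axes of the object are indexed. See Query via data columns for more information. Applicable only to format='table'.
    errors : str, default 'strict'
        Specifies how encoding and decoding errors are to be handled. See the errors argument for open() for a full list of options.
    encoding : str, default "UTF-8"

398-2. Parameters

398-2-1. path_or_buf (required): A string or a pandas.HDFStore object. A string is the path of the HDF5 file to write; with the default mode 'a' the file is created if it does not exist.

398-2-2. key (required): String. The node path under which the data is stored inside the HDF5 file. It acts as the unique identifier of the data within the file, much like a file path in a file system.

398-2-3. mode (optional, default 'a'): String. The mode in which the file is opened:
    'w': write mode; a new file is created and an existing file with the same name is overwritten.
    'a': append mode; an existing file is opened for reading and writing, and it is created if it does not exist.
    'r+': like 'a', but the file must already exist.

398-2-4. complevel (optional, default None): Integer from 0 to 9, the compression level. Higher levels generally compress more but make compression and decompression slower; 0 or None disables compression.

398-2-5. complib (optional, default None): String. The compression library; allowed values include 'zlib', 'lzo', 'bzip2' and 'blosc'. The library determines the compression algorithm. According to the documentation quoted above, 'zlib' is the library used when compression is enabled and no library is specified.

398-2-6. append (optional, default False): Boolean. For the table format, True appends the data to the existing node; with False the data at an existing node is overwritten.

398-2-7. format (optional, default None): String. The storage format. 'fixed' is the fixed format (fast to write and read, but neither appendable nor searchable); 'table' is the table format (may be slower, but supports appending and querying subsets). If None, the option io.hdf.default_format is checked and 'fixed' is the fallback.

398-2-8. index (optional, default True): Boolean. Whether the index is written to the HDF5 file.

398-2-9. min_itemsize (optional, default None): Dict or integer. The minimum size of string columns. Use it to reserve enough space when very long strings need to be stored.

398-2-10. nan_rep (optional, default None): The string representation used for null values. Not allowed together with append=True.

398-2-11. dropna (optional, default None): Boolean. If True, missing values are removed before writing.

398-2-12. data_columns (optional, default None): List of columns, or True for all columns. These columns are indexed as data columns so that on-disk queries can filter on them. Applicable only to format='table'.

398-2-13. errors (optional, default 'strict'): String. How encoding errors are handled: 'strict' raises an error, 'ignore' skips the offending characters, 'replace' substitutes a replacement character.

398-2-14. encoding (optional, default 'UTF-8'): String. The encoding used when storing string data.

398-3. Function

Saves a pandas Series object to an HDF5 file, optionally compressing the data and choosing the storage format.

398-4. Return value

The method has no return value; it writes the data directly to the given HDF5 file.

398-5. Notes

None

398-6. Usage

398-6-1. Data preparation

None

398-6-2. Code example

    # 398. pandas.Series.to_hdf method
    # Requires the optional PyTables dependency (package name: tables)
    import pandas as pd
    # Create a Series object
    s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
    # Save the Series to an HDF5 file
    s.to_hdf('data.h5', key='series_data', mode='w', format='table')

398-6-3. Output

None (the call writes data.h5 and prints nothing)

399. pandas.Series.to_sql method

399-1. Syntax

    # 399. pandas.Series.to_sql method
    pandas.Series.to_sql(name, con, *, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None)

    Write records stored in a DataFrame to a SQL database.

    Databases supported by SQLAlchemy [1] are supported. Tables can be newly created, appended to, or overwritten.

    Parameters:
    name : str
        Name of SQL table.
    con : sqlalchemy.engine.(Engine or Connection) or sqlite3.Connection
        Using SQLAlchemy makes it possible to use any DB supported by that library. Legacy support is provided for sqlite3.Connection objects. The user is responsible for engine disposal and connection closure for the SQLAlchemy connectable. See here. If passing a sqlalchemy.engine.Connection which is already in a transaction, the transaction will not be committed. If passing a sqlite3.Connection, it will not be possible to roll back the record insertion.
    schema : str, optional
        Specify the schema (if database flavor supports this). If None, use default schema.
    if_exists : {'fail', 'replace', 'append'}, default 'fail'
        How to behave if the table already exists.
        fail: Raise a ValueError.
        replace: Drop the table before inserting new values.
        append: Insert new values to the existing table.
    index : bool, default True
        Write DataFrame index as a column. Uses index_label as the column name in the table. Creates a table index for this column.
    index_label : str or sequence, default None
        Column label for index column(s). If None is given (default) and index is True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.
    chunksize : int, optional
        Specify the number of rows in each batch to be written at a time. By default, all rows will be written at once.
    dtype : dict or scalar, optional
        Specifying the datatype for columns. If a dictionary is used, the keys should be the column names and the values should be the SQLAlchemy types or strings for the sqlite3 legacy mode. If a scalar is provided, it will be applied to all columns.
    method : {None, 'multi', callable}, optional
        Controls the SQL insertion clause used:
        None: Uses standard SQL INSERT clause (one per row).
        'multi': Pass multiple values in a single INSERT clause.
        callable with signature (pd_table, conn, keys, data_iter).
        Details and a sample callable implementation can be found in the section insert method.

    Returns:
    None or int
        Number of rows affected by to_sql. None is returned if the callable passed into method does not return an integer number of rows.
        The number of returned rows affected is the sum of the rowcount attribute of sqlite3.Cursor or SQLAlchemy connectable which may not reflect the exact number of written rows as stipulated in the sqlite3 or SQLAlchemy.
        New in version 1.4.0.

    Raises:
    ValueError
        When the table already exists and if_exists is 'fail' (the default).

399-2. Parameters

399-2-1. name (required): String. The name of the target SQL table that the data is written to.

399-2-2. con (required): A database connection object for the target database, usually a SQLAlchemy engine or connection, or a sqlite3 connection object.

399-2-3. schema (optional, default None): String. The database schema to use; if not given, the database's default schema is used.

399-2-4. if_exists (optional, default 'fail'): String. What to do when the table already exists:
    'fail': raise a ValueError.
    'replace': drop the existing table and create a new one.
    'append': append the data to the existing table.

399-2-5. index (optional, default True): Boolean. Whether the index of the Series is written to the SQL table. If True, the index is stored as a column of the table.

399-2-6. index_label (optional, default None): String or sequence. The column name for the index column. If not given, the name of the Series index is used; if the index has no name, the column is called index.

399-2-7. chunksize (optional, default None): Integer. The number of rows written per batch. By default all rows are written at once; for large data sets, batching can help avoid memory problems.

399-2-8. dtype (optional, default None): Dict or scalar. The data type of the columns. For example, {'col_name': sqlalchemy.types.Integer} sets that column's type to Integer, which helps keep column types consistent in the SQL table. A scalar is applied to all columns.

399-2-9. method (optional, default None): String or callable. How rows are inserted. None issues one standard INSERT per row; 'multi' passes multiple values in a single INSERT statement, which may speed up insertion depending on the database; a callable with signature (pd_table, conn, keys, data_iter) implements custom insertion logic.

399-3. Function

Inserts the data of a pandas Series into the given SQL table. If the table already exists, the data can be appended or the table replaced. This is useful when data needs to be stored long term or queried with SQL.

399-4. Return value

Returns the number of rows affected (int), or None if a callable passed as method does not return an integer row count. As the quoted documentation notes, this count is taken from the driver's rowcount and may not match the exact number of rows written.

399-5. Notes

None

399-6. Usage

399-6-1. Data preparation

None

399-6-2. Code example

    # 399. pandas.Series.to_sql method
    import pandas as pd
    from sqlalchemy import create_engine
    # Create a Series object
    s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
    # Create a database connection (for example, a SQLite database)
    engine = create_engine('sqlite:///my_database.db')
    # Save the Series to a SQL table
    s.to_sql(name='my_table', con=engine, if_exists='replace', index=True)

399-6-3. Output

None

400. pandas.Series.to_json method

400-1. Syntax

    # 400. pandas.Series.to_json method
    pandas.Series.to_json(path_or_buf=None, *, orient=None, date_format=None, double_precision=10, force_ascii=True, date_unit='ms', default_handler=None, lines=False, compression='infer', index=None, indent=None, storage_options=None, mode='w')

    Convert the object to a JSON string.

    Note NaN's and None will be converted to null and datetime objects will be converted to UNIX timestamps.

    Parameters:
    path_or_buf : str, path object, file-like object, or None, default None
        String, path object (implementing os.PathLike[str]), or file-like object implementing a write() function. If None, the result is returned as a string.
    orient : str
        Indication of expected JSON string format.
        Series: default is 'index'; allowed values are: {'split', 'records', 'index', 'table'}.
        DataFrame: default is 'columns'; allowed values are: {'split', 'records', 'index', 'columns', 'values', 'table'}.
        The format of the JSON string:
        'split': dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
        'records': list like [{column -> value}, ..., {column -> value}]
        'index': dict like {index -> {column -> value}}
        'columns': dict like {column -> {index -> value}}
        'values': just the values array
        'table': dict like {'schema': {schema}, 'data': {data}}
        Describing the data, where data component is like orient='records'.
    date_format : {None, 'epoch', 'iso'}
        Type of date conversion. 'epoch' = epoch milliseconds, 'iso' = ISO8601. The default depends on the orient. For orient='table', the default is 'iso'. For all other orients, the default is 'epoch'.
    double_precision : int, default 10
        The number of decimal places to use when encoding floating point values. The possible maximal value is 15. Passing double_precision greater than 15 will raise a ValueError.
    force_ascii : bool, default True
        Force encoded string to be ASCII.
    date_unit : str, default 'ms' (milliseconds)
        The time unit to encode to, governs timestamp and ISO8601 precision. One of 's', 'ms', 'us', 'ns' for second, millisecond, microsecond, and nanosecond respectively.
    default_handler : callable, default None
        Handler to call if object cannot otherwise be converted to a suitable format for JSON. Should receive a single argument which is the object to convert and return a serialisable object.
    lines : bool, default False
        If 'orient' is 'records' write out line-delimited json format. Will throw ValueError if incorrect 'orient' since others are not list-like.
    compression : str or dict, default 'infer'
        For on-the-fly compression of the output data. If 'infer' and 'path_or_buf' is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', '.xz', '.zst', '.tar', '.tar.gz', '.tar.xz' or '.tar.bz2' (otherwise no compression). Set to None for no compression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd', 'xz', 'tar'} and other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdCompressor, lzma.LZMAFile or tarfile.TarFile, respectively. As an example, the following could be passed for faster compression and to create a reproducible gzip archive: compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}.
        New in version 1.5.0: Added support for .tar files.
        Changed in version 1.4.0: Zstandard support.
    index : bool or None, default None
        The index is only used when 'orient' is 'split', 'index', 'column', or 'table'. Of these, 'index' and 'column' do not support index=False.
    indent : int, optional
        Length of whitespace used to indent each record.
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with "s3://", and "gcs://") the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details, and for more examples on storage options refer here.
    mode : str, default 'w' (writing)
        Specify the IO mode for output when supplying a path_or_buf. Accepted args are 'w' (writing) and 'a' (append) only. mode='a' is only supported when lines is True and orient is 'records'.

    Returns:
    None or str
        If path_or_buf is None, returns the resulting json format as a string. Otherwise returns None.

400-2. Parameters

400-2-1. path_or_buf (optional, default None): String, path object or file-like object giving the destination of the JSON data. If None, the JSON is returned as a string; otherwise it is written to the given file or buffer.

400-2-2. orient (optional, default None): String. The layout of the JSON output. For a Series the default is 'index' and the documented values are:
    'split': outputs {"index" -> [index], "data" -> [values]}.
    'records': outputs a list of the values, [value, ...]; the index is not included.
    'index': outputs {index -> value}.
    'table': outputs the JSON Table Schema format.
    'values' (just the values array) is documented for DataFrame only.

400-2-3. date_format (optional, default None): String. The date format, either 'epoch' (timestamp) or 'iso' (ISO8601). If None, the default depends on orient: 'iso' for orient='table' and 'epoch' for all other orients.

400-2-4. double_precision (optional, default 10): Integer. The number of decimal places used when encoding floating point values. The maximum is 15; larger values raise a ValueError.

400-2-5. force_ascii (optional, default True): Boolean. If True, all non-ASCII characters are escaped; set it to False to keep the original Unicode characters.

400-2-6. date_unit (optional, default 'ms'): String. The time unit: 's' (seconds), 'ms' (milliseconds), 'us' (microseconds) or 'ns' (nanoseconds).

400-2-7. default_handler (optional, default None): Callable used for objects that cannot be serialised to JSON directly. Pass a custom function that receives the object and returns something serialisable.

400-2-8. lines (optional, default False): Boolean. If True, each record is written on its own line (line-delimited JSON), which is useful for large files. Only valid with orient='records'; other orients raise a ValueError.

400-2-9. compression (optional, default 'infer'): String, dict or None. The compression method; values include 'gzip', 'bz2', 'zip', 'xz', 'zstd', 'tar' or None. With 'infer', the method is inferred from the file extension.

400-2-10. index (optional, default None): Boolean or None. Whether the index of the Series is included. It only takes effect when orient is 'split', 'index', 'column' or 'table', and 'index' and 'column' do not support index=False.

400-2-11. indent (optional, default None): Integer. The indentation width of the JSON output. None means no indentation, giving compact output.

400-2-12. storage_options (optional, default None): Dict. Extra options passed to the storage backend (for example configuration for S3 or GCS).

400-2-13. mode (optional, default 'w'): String. The file write mode, 'w' (write) or 'a' (append). 'a' is only supported when lines=True and orient='records'.

400-3. Function

Serialises a Series object to JSON. The parameters give control over the layout, precision and other details of the output, and the result can either be written to a file or returned as a JSON string.

400-4. Return value

If path_or_buf is None, returns a JSON string containing the Series data. If path_or_buf is given, the JSON is written to that file or buffer and None is returned.

400-5. Notes

None

400-6. Usage

400-6-1. Data preparation

None

400-6-2. Code example

    # 400. pandas.Series.to_json method
    import pandas as pd
    # Create a Series object
    s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
    # Convert the Series to a JSON string
    json_str = s.to_json()
    # Save the Series as a JSON file
    s.to_json('output.json', orient='split', indent=4)

400-6-3. Output

None (the example writes output.json and prints nothing)

II. Recommended reading

1. Python Foundations Tour
2. Python Functions Tour
3. Python Algorithms Tour
4. Python Magic Tour
5. Blog home page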
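Supplement to section 399 (to_sql): the documentation quoted there mentions that sqlite3.Connection objects are supported. The sketch below relies on that to show what ends up in the table, using an in-memory SQLite database so that no file or SQLAlchemy install is needed. The Series name 'value' and the verification query are illustrative assumptions, not part of the original example.

```python
import sqlite3

import pandas as pd

# Named Series, so the value column gets a readable name ('value' is illustrative)
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'], name='value')

# In-memory SQLite database: nothing is written to disk
conn = sqlite3.connect(':memory:')

# The Series index has no name, so it is stored in a column called "index"
s.to_sql(name='my_table', con=conn, if_exists='replace', index=True)

# Read the rows back with plain SQL to check what was written
rows = conn.execute(
    'SELECT "index", value FROM my_table ORDER BY "index"'
).fetchall()
print(rows)

conn.close()
```

Passing index_label to to_sql would give the index column a different name than the default used here.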

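Supplement to section 400 (to_json): a minimal sketch, assuming only pandas and the standard json module, of how the orient values described there change the output for the same Series. Parsing the result with json.loads rather than comparing raw strings is a choice made for this sketch, not something taken from the original example.

```python
import json

import pandas as pd

# Same Series as in example 400-6-2
s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# path_or_buf is omitted, so every call returns a JSON string
as_index = json.loads(s.to_json())                    # default orient for a Series is 'index'
as_records = json.loads(s.to_json(orient='records'))  # values only, the index is dropped
as_split = json.loads(s.to_json(orient='split'))      # index and data kept in separate lists

print(as_index)
print(as_records)
print(as_split['index'], as_split['data'])
```

Because 'records' drops the index, 'index' or 'split' is the safer choice when the labels need to survive a round trip.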