A Tour of Cool Python Libraries: The Third-Party Library Pandas (079)

见贤思齐 · Posted 2024-9-10 03:06:54

I. Usage in Detail

326. pandas.Series.str.normalize

326-1. Syntax

    # 326. pandas.Series.str.normalize
    pandas.Series.str.normalize(form)

    Return the Unicode normal form for the strings in the Series/Index.
    For more information on the forms, see the unicodedata.normalize().

    Parameters:
    form : {'NFC', 'NFKC', 'NFD', 'NFKD'}
        Unicode form.

    Returns:
    Series/Index of objects.

326-2. Parameters

326-2-1. form (required): the normalization form, one of the following four:

    'NFC': Normalization Form C (Canonical Composition). Combines decomposed sequences into a single character where one exists. For example, both the precomposed "é" (U+00E9) and the sequence "e" + U+0301 become the single code point U+00E9.
    'NFD': Normalization Form D (Canonical Decomposition). Splits characters into a base character plus combining marks. For example, "é" (U+00E9) becomes "e" followed by the combining acute accent U+0301.
    'NFKC': Normalization Form KC (Compatibility Composition). Replaces compatibility characters with their equivalents (for example, the ligature "ﬁ" becomes "fi"), then applies canonical composition.
    'NFKD': Normalization Form KD (Compatibility Decomposition). Replaces compatibility characters with their equivalents and leaves the result decomposed into base characters and combining marks.

326-3. Function

Normalizes each string so that equivalent character sequences share a single representation. This is useful when data comes from different sources, when string formats need to be unified, and when string comparisons must be consistent.

326-4. Return value

Returns a new pandas.Series in which every string has been normalized to the requested form.

326-5. Notes

None

326-6. Usage

326-6-1. Data preparation

None

326-6-2. Code example

    # 326. pandas.Series.str.normalize
    import pandas as pd
    # Sample data: one precomposed "é" and two decomposed sequences (e + U+0301)
    data = pd.Series(['café', 'e\u0301clair', 'cafe\u0301'])
    # Normalize with NFC
    normalized_data = data.str.normalize('NFC')
    print(normalized_data)
    # Normalize with NFD
    normalized_data = data.str.normalize('NFD')
    print(normalized_data)
    # Normalize with NFKC
    normalized_data = data.str.normalize('NFKC')
    print(normalized_data)
    # Normalize with NFKD
    normalized_data = data.str.normalize('NFKD')
    print(normalized_data)

326-6-3. Output

The four results look identical when printed; they differ only in the underlying code points.

    # 326. pandas.Series.str.normalize
    # 0      café
    # 1    éclair
    # 2      café
    # dtype: object
    # 0      café
    # 1    éclair
    # 2      café
    # dtype: object
    # 0      café
    # 1    éclair
    # 2      café
    # dtype: object
    # 0      café
    # 1    éclair
    # 2      café
    # dtype: object

327. pandas.Series.str.pad

327-1. Syntax

    # 327. pandas.Series.str.pad
    pandas.Series.str.pad(width, side='left', fillchar=' ')

    Pad strings in the Series/Index up to width.

    Parameters:
    width : int
        Minimum width of resulting string; additional characters will be filled with character defined in fillchar.
    side : {'left', 'right', 'both'}, default 'left'
        Side from which to fill resulting string.
    fillchar : str, default ' '
        Additional character for filling, default is whitespace.

    Returns:
    Series or Index of object
        Returns Series or Index with minimum number of char in object.

327-2. Parameters

327-2-1. width (required): an integer giving the minimum width of the padded string. A string shorter than this width is filled on the chosen side until it reaches the width; a string that is already at least this long is left unchanged.

327-2-2. side (optional, default 'left'): the side on which to pad. The options are:

    'left': pad on the left of the string.
    'right': pad on the right of the string.
    'both': pad on both sides. When the padding cannot be split evenly, one side receives one extra fill character. For object-dtype strings the placement follows Python's str.center, so it depends on the lengths involved; in the example below the extra character lands on the right.

327-2-3. fillchar (optional, default ' '): the string used for padding. It must be exactly one character long.

327-3. Function

Pads each string to a given width, which is useful for aligning text or formatting output. Fill characters can be added on the left, on the right, or on both sides.

327-4. Return value

Returns a new pandas.Series in which each string has been padded on the chosen side with the chosen fill character so that it is at least width characters long.

327-5. Notes

None

327-6. Usage

327-6-1. Data preparation

None

327-6-2. Code example

    # 327. pandas.Series.str.pad
    import pandas as pd
    # Sample data
    data = pd.Series(['cat', 'dog', 'elephant'])
    # Pad on the left to a width of 10 with '*'
    padded_left = data.str.pad(width=10, side='left', fillchar='*')
    # Pad on the right to a width of 10 with '-'
    padded_right = data.str.pad(width=10, side='right', fillchar='-')
    # Pad on both sides to a width of 10 with '~'
    padded_both = data.str.pad(width=10, side='both', fillchar='~')
    print("Left Padded:\n", padded_left)
    print("Right Padded:\n", padded_right)
    print("Both Sides Padded:\n", padded_both)

327-6-3. Output

    # 327. pandas.Series.str.pad
    # Left Padded:
    # 0    *******cat
    # 1    *******dog
    # 2    **elephant
    # dtype: object
    # Right Padded:
    # 0    cat-------
    # 1    dog-------
    # 2    elephant--
    # dtype: object
    # Both Sides Padded:
    # 0    ~~~cat~~~~
    # 1    ~~~dog~~~~
    # 2    ~elephant~
    # dtype: object

328. pandas.Series.str.partition

328-1. Syntax

    # 328. pandas.Series.str.partition
    pandas.Series.str.partition(sep=' ', expand=True)

    Split the string at the first occurrence of sep.
    This method splits the string at the first occurrence of sep, and returns 3 elements containing the part before the separator, the separator itself, and the part after the separator. If the separator is not found, return 3 elements containing the string itself, followed by two empty strings.

    Parameters:
    sep : str, default whitespace
        String to split on.
    expand : bool, default True
        If True, return DataFrame/MultiIndex expanding dimensionality. If False, return Series/Index.

    Returns:
    DataFrame/MultiIndex or Series/Index of objects.

328-2. Parameters

328-2-1. sep (optional, default ' '): the separator string to split on; it may be a single character or a longer string. If the separator does not occur in a string, the result holds the original string followed by two empty strings.

328-2-2. expand (optional, default True): a boolean that selects the form of the result. If True, the method returns a DataFrame with three columns: the part before the separator, the separator itself, and the part after the separator. If False, it returns a Series in which each element is a tuple (before, sep, after).

328-3. Function

Splits each string into three parts at the first occurrence of the separator. This is convenient for strings that contain a known delimiter, because the content on either side of it can be retrieved in one step.

328-4. Return value

The result depends on expand:

    expand=True: a DataFrame whose columns hold the part before the separator, the separator itself, and the part after the separator.
    expand=False: a Series in which each element is a (before, sep, after) tuple.

328-5. Notes

None

328-6. Usage

328-6-1. Data preparation

None

328-6-2. Code example

    # 328. pandas.Series.str.partition
    import pandas as pd
    # Sample data
    data = pd.Series(['apple-pie', 'banana-split', 'cherry'])
    # Split on '-', expand=True, returns a DataFrame
    partitioned_df = data.str.partition(sep='-', expand=True)
    # Split on '-', expand=False, returns a Series of tuples
    partitioned_series = data.str.partition(sep='-', expand=False)
    print("Partitioned DataFrame:\n", partitioned_df)
    print("Partitioned Series:\n", partitioned_series)

328-6-3. Output

    # 328. pandas.Series.str.partition
    # Partitioned DataFrame:
    #         0  1      2
    # 0   apple  -    pie
    # 1  banana  -  split
    # 2  cherry
    # Partitioned Series:
    # 0       (apple, -, pie)
    # 1    (banana, -, split)
    # 2          (cherry, , )
    # dtype: object

329. pandas.Series.str.removeprefix

329-1. Syntax

    # 329. pandas.Series.str.removeprefix
    pandas.Series.str.removeprefix(prefix)

    Remove a prefix from an object series.
    If the prefix is not present, the original string will be returned.

    Parameters:
    prefix : str
        Remove the prefix of the string.

    Returns:
    Series/Index: object
        The Series or Index with given prefix removed.

329-2. Parameters

329-2-1. prefix (required): the prefix string to remove. If a string starts with prefix, that leading part is removed; if it does not, the string is left unchanged.

329-3. Function

Removes the given prefix from the start of each string. This is especially handy during data cleaning, when a uniform leading identifier or a fixed-format prefix has to be dropped.

329-4. Return value

Returns a new Series in which the prefix has been removed from every string that started with it. Strings that do not start with the prefix are returned unchanged.

329-5. Notes

None

329-6. Usage

329-6-1. Data preparation

None

329-6-2. Code example

    # 329. pandas.Series.str.removeprefix
    import pandas as pd
    # Sample data
    data = pd.Series(['prefix_text1', 'prefix_text2', 'no_prefix_text'])
    # Use removeprefix to strip the prefix 'prefix_'
    removed_prefix = data.str.removeprefix('prefix_')
    print("Original Series:\n", data)
    print("Series after removing prefix:\n", removed_prefix)

329-6-3. Output

    # 329. pandas.Series.str.removeprefix
    # Original Series:
    # 0      prefix_text1
    # 1      prefix_text2
    # 2    no_prefix_text
    # dtype: object
    # Series after removing prefix:
    # 0             text1
    # 1             text2
    # 2    no_prefix_text
    # dtype: object

330. pandas.Series.str.removesuffix

330-1. Syntax

    # 330. pandas.Series.str.removesuffix
    pandas.Series.str.removesuffix(suffix)

    Remove a suffix from an object series.
    If the suffix is not present, the original string will be returned.

    Parameters:
    suffix : str
        Remove the suffix of the string.

    Returns:
    Series/Index: object
        The Series or Index with given suffix removed.

330-2. Parameters

330-2-1. suffix (required): the suffix string to remove. If a string ends with suffix, that trailing part is removed; if it does not, the string is left unchanged.

330-3. Function

Removes the given suffix from the end of each string. This is especially handy during data cleaning, when a uniform trailing identifier or a fixed-format suffix has to be dropped.

330-4. Return value

Returns a new Series in which the suffix has been removed from every string that ended with it. Strings that do not end with the suffix are returned unchanged.

330-5. Notes

None

330-6. Usage

330-6-1. Data preparation

None

330-6-2. Code example

    # 330. pandas.Series.str.removesuffix
    import pandas as pd
    # Sample data
    data = pd.Series(['text1_suffix', 'text2_suffix', 'text3_nosuffix'])
    # Use removesuffix to strip the suffix '_suffix'
    removed_suffix = data.str.removesuffix('_suffix')
    print("Original Series:\n", data)
    print("Series after removing suffix:\n", removed_suffix)

330-6-3. Output

    # 330. pandas.Series.str.removesuffix
    # Original Series:
    # 0      text1_suffix
    # 1      text2_suffix
    # 2    text3_nosuffix
    # dtype: object
    # Series after removing suffix:
    # 0             text1
    # 1             text2
    # 2    text3_nosuffix
    # dtype: object
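Supplementary sketch for str.normalize: the printed results of the four forms look the same, so it can help to inspect string lengths and equality instead. The sample strings and variable names here are illustrative assumptions, not part of the original post.

```python
import pandas as pd

# Two spellings of the same word: precomposed é (U+00E9) vs. e + combining acute (U+0301)
s = pd.Series(["caf\u00e9", "cafe\u0301"])

raw_lengths = s.str.len().tolist()                       # code points before normalizing
nfc = s.str.normalize("NFC")
nfd = s.str.normalize("NFD")
nfc_lengths = nfc.str.len().tolist()                     # composed: one code point for é
nfd_lengths = nfd.str.len().tolist()                     # decomposed: two code points for é

equal_before = s[0] == s[1]                              # differs at the code-point level
equal_after = nfc[0] == nfc[1]                           # identical once both are NFC

# Compatibility forms also fold characters such as the "ﬁ" ligature (U+FB01)
ligature = pd.Series(["\ufb01le"]).str.normalize("NFKC")[0]

print(raw_lengths, nfc_lengths, nfd_lengths)
print(equal_before, equal_after, ligature)
```

Normalizing both sides of a comparison or join key to the same form is the usual way to avoid mismatches between visually identical strings.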
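Supplementary sketch for str.pad with side='both': for object-dtype strings the result should match Python's own str.center, so which side receives the odd fill character is not fixed. The sample words are illustrative assumptions, and the Series is created with dtype=object on purpose, since other string backends may behave differently.

```python
import pandas as pd

# Object dtype is chosen deliberately; this sketch only covers that case
words = pd.Series(["cat", "ab", "elephant"], dtype=object)

padded_10 = words.str.pad(width=10, side="both", fillchar="~").tolist()
padded_5 = words.str.pad(width=5, side="both", fillchar="~").tolist()

# Reference results computed with plain Python str.center
center_10 = [w.center(10, "~") for w in words]
center_5 = [w.center(5, "~") for w in words]

print(padded_10)
print(padded_5)
```

Strings that are already at least `width` characters long, such as "elephant" at width 5, come back unchanged.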
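Supplementary sketch for str.partition: a common use is splitting "key=value" strings, where only the first separator counts. The sample data and the column names `key`, `sep`, and `value` are hypothetical and chosen only for illustration.

```python
import pandas as pd

settings = pd.Series(["host=localhost", "expr=a=b", "debug"])

# expand=True (the default) yields three columns labelled 0, 1, 2
parts = settings.str.partition("=")
parts.columns = ["key", "sep", "value"]   # hypothetical, friendlier labels

keys = parts["key"].tolist()
values = parts["value"].tolist()

# expand=False keeps one tuple per row instead
as_tuples = settings.str.partition("=", expand=False).tolist()

print(parts)
print(as_tuples)
```

Rows without the separator keep the whole string in the first column and get empty strings in the other two, which makes missing values easy to detect.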
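Supplementary sketch for str.removeprefix and str.removesuffix: these remove one exact leading or trailing substring, which is different from str.lstrip and str.rstrip, which strip any characters from a set. The sample data is an illustrative assumption, and the sketch assumes pandas 1.4 or later, where these two accessors are available.

```python
import pandas as pd

names = pd.Series(["prefix_text1", "prefix_prefix_x", "no_prefix_text"])

# Exact-substring removal, applied at most once per string
exact = names.str.removeprefix("prefix_").tolist()

# lstrip treats its argument as a set of characters and keeps stripping
char_set = names.str.lstrip("prefix_").tolist()

files = pd.Series(["report.csv", "archive.csv.csv", "notes.txt"])
no_ext = files.str.removesuffix(".csv").tolist()

print(exact)
print(char_set)
print(no_ext)
```

The second element shows the difference most clearly: removeprefix leaves "prefix_x", while lstrip consumes the entire string because every character belongs to the strip set.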
