I am reading a pipe-delimited file that has no header row, using pandas 0.24.2. The data is public, so confidentiality is not a concern.
The data looks like this:
999778247820|R|JPMORGAN CHASE BANK, NATIONAL ASSOCIATION|7.375|113000|360|02/2001|04/2001|95|95|1|52|665|Y|P|SF|1|P|IL|601|30|FRM||1|N
999783196683|R|OTHER|7.25|59000|360|01/2001|04/2001|97|97|2|43|682|Y|P|PU|1|P|HI|967|30|FRM|676|1|N
999783470376|C|BANK OF AMERICA, N.A.|7.875|110000|360|12/2000|02/2001|74|74|2|26|700|N|P|SF|1|P|NY|125||FRM|698||N
999786911479|C|BANK OF AMERICA, N.A.|7.5|57000|360|12/2000|02/2001|90|90|1|28|699|N|P|SF|1|P|TX|781|25|FRM||1|N
999786913710|R|JPMORGAN CHASE BANK, NA|7.125|114000|360|01/2001|04/2001|73|73|2|16|745|N|C|SF|1|P|WA|992||FRM|||N
999788833695|B|OTHER|9|50000|360|10/2000|12/2000|90|90|2|40|674|N|P|SF|2|I|WI|535|25|FRM|737|1|N
Here is the code I am using:
import glob

# Collect the acquisition files (the path is elided, as in the original post)
orig_files_fnma = glob.glob("/...1/Acquisition*.txt")
col_names = ["loan_id", "origination_channel", "seller_name", "original_interest_rate", "original_upb", "original_loan_term", "origination_date", "first_payment_date", "original_ltv", "original_cltv", "number_of_borrowers", "original_dti",
"borrower_fico_at_origination", "first_time_home_buyer_indicator", "loan_purpose", "property_type", "number_of_units", "occupancy_type", "property_state", "zip_code_short", "primary_mortgage_insurance_percent",
"product_type", "coborrower_fico_at_origination", "mortgage_insurance_type", "relocation_mortgage_indicator"]
I always get the following error:
Filled 1 NA values in column original_ltv
Filled 52 NA values in column original_cltv
ValueError: Unable to convert column number_of_borrowers to type int
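I did find that it works if I do not predefine the dtypes and instead use .astype to change the data types after loading. But I would like to know whether I can predefine the data types up front, as in the code above.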
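A minimal sketch of predefining dtypes at load time, under one assumption: the conversion to int fails because some integer columns contain empty fields, which a plain NumPy int column cannot hold. The full file is not shown, so that cause is a guess. The two sample rows are copied from the data above; the `dtypes` mapping is hypothetical and uses the nullable "Int64" extension dtype that pandas introduced in 0.24. Whether `read_csv` accepts it on 0.24.2 exactly should be verified; reading such columns as "float64" is a fallback.

```python
import io
import pandas as pd

col_names = [
    "loan_id", "origination_channel", "seller_name", "original_interest_rate",
    "original_upb", "original_loan_term", "origination_date", "first_payment_date",
    "original_ltv", "original_cltv", "number_of_borrowers", "original_dti",
    "borrower_fico_at_origination", "first_time_home_buyer_indicator",
    "loan_purpose", "property_type", "number_of_units", "occupancy_type",
    "property_state", "zip_code_short", "primary_mortgage_insurance_percent",
    "product_type", "coborrower_fico_at_origination", "mortgage_insurance_type",
    "relocation_mortgage_indicator",
]

# Two rows from the sample data; several integer fields are empty.
sample = (
    "999778247820|R|JPMORGAN CHASE BANK, NATIONAL ASSOCIATION|7.375|113000|360|"
    "02/2001|04/2001|95|95|1|52|665|Y|P|SF|1|P|IL|601|30|FRM||1|N\n"
    "999783470376|C|BANK OF AMERICA, N.A.|7.875|110000|360|"
    "12/2000|02/2001|74|74|2|26|700|N|P|SF|1|P|NY|125||FRM|698||N\n"
)

# Hypothetical mapping: nullable "Int64" for integer columns that may be empty.
dtypes = {
    "loan_id": "int64",
    "seller_name": "object",
    "number_of_borrowers": "Int64",
    "primary_mortgage_insurance_percent": "Int64",
    "coborrower_fico_at_origination": "Int64",
    "mortgage_insurance_type": "Int64",
}

df = pd.read_csv(
    io.StringIO(sample),
    sep="|",
    header=None,      # the file has no header row
    names=col_names,
    dtype=dtypes,
)
print(df[list(dtypes)].dtypes)
```

For the real files, the `io.StringIO(sample)` argument would be replaced by each path in `orig_files_fnma`.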
Also, I want to limit the object (string) columns to a length of 20. What is the correct code to do that?
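As far as I know, a pandas object column has no fixed-width setting, so one possible approach is to truncate the strings after loading. The sketch below assumes that truncation to 20 characters is what is wanted; the small DataFrame is built by hand from seller names in the sample data, purely for illustration.

```python
import pandas as pd

# Illustrative frame using seller names from the sample data.
df = pd.DataFrame({
    "seller_name": [
        "JPMORGAN CHASE BANK, NATIONAL ASSOCIATION",
        "OTHER",
        "BANK OF AMERICA, N.A.",
    ]
})

# Keep at most the first 20 characters of each string.
df["seller_name"] = df["seller_name"].str.slice(0, 20)

# Longest remaining string length in the column.
max_len = df["seller_name"].str.len().max()
print(df["seller_name"].tolist(), max_len)
```

The same `.str.slice(0, 20)` call can be applied to every object column in a loop if all of them need the limit.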
慕桂英3389331