Writing a pandas DataFrame to JSON in Unicode

newnotes 2023. 3. 26. 11:45

I'm trying to write a pandas DataFrame containing Unicode to JSON, but the built-in .to_json function escapes the characters. How do I fix this?

Example:

import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
df.to_json('df.json')

This results in:

{"0":{"0":"\u03c4","1":"\u03c0"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}

This differs from the desired result:

{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}

I have tried adding the force_ascii=False argument:

import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
df.to_json('df.json', force_ascii=False)

But this gives the following error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u03c4' in position 11: character maps to <undefined>

I'm using WinPython 3.4.4.2 64bit with pandas 0.18.0
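
The 'charmap' in the message suggests the file was opened with the platform's default text encoding rather than UTF-8; on many Windows setups that default is a legacy code page. The following standard-library sketch assumes cp1252 as that code page (the question does not say which one was active) and uses a hypothetical helper, can_encode, to show that τ has no mapping there while UTF-8 handles it:

```python
import locale


def can_encode(text, encoding):
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Encoding that open() typically falls back to when no encoding= is passed
# (platform dependent).
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)

# cp1252 is an assumption here: a common Windows code page without Greek letters.
print(can_encode('τ', 'cp1252'))
print(can_encode('τ', 'utf-8'))
```

If the first line printed on your machine is not a UTF-8 variant, passing encoding='utf-8' explicitly when opening the file is the safer choice.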

Opening a file with the encoding set to utf-8 and then passing that file to the .to_json function fixes the problem:

with open('df.json', 'w', encoding='utf-8') as file:
    df.to_json(file, force_ascii=False)

This gives the correct result:

{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}

Note: this still requires the force_ascii=False argument.
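
One way to check the approach above is to read the file back and confirm that the characters were stored unescaped. This is only a sketch: it writes to a temporary directory (an assumption made to keep the example self-contained) and parses the result with the standard json module.

```python
import json
import os
import tempfile

import pandas as pd

df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'df.json')
    # Same technique as above: an explicitly UTF-8 file handle plus force_ascii=False.
    with open(path, 'w', encoding='utf-8') as file:
        df.to_json(file, force_ascii=False)
    # Read the raw bytes back so the decoding step is explicit.
    with open(path, 'rb') as file:
        raw = file.read()

text = raw.decode('utf-8')
data = json.loads(text)
print(text)
print(data['0'])
```

Reading the file later needs the same encoding='utf-8' argument, for the same reason it was needed when writing.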

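There is also another way to do the same thing. JSON consists of keys (strings in double quotes) and values (strings, numbers, nested JSON objects or arrays), which makes it very similar to a Python dictionary, so you can get JSON from a pandas DataFrame with a simple conversion and some string operations.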

import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])

# convert index values to strings (JSON requires string keys)
df.index = df.index.map(str)
# convert column names to strings (JSON requires string keys)
df.columns = df.columns.map(str)

# convert the DataFrame to a dict, the dict to a string, and swap single quotes for double quotes
js = str(df.to_dict()).replace("'", '"')
print(js)  # print it, write it to a file, return it from a REST endpoint, etc.

Output:

{"0": {"0": "τ", "1": "π"}, "1": {"0": "a", "1": "b"}, "2": {"0": 1, "1": 2}}
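
The quote replacement only yields valid JSON while no value contains an apostrophe and the frame holds nothing but strings and numbers; Python's True, None and nan are not JSON tokens. As a hedged alternative that is not part of the original answer, the standard json module can re-serialise the output of to_json() with ensure_ascii=False. The "it's" value below is an illustrative assumption chosen to trip up a plain replacement:

```python
import json

import pandas as pd

# "it's" is an illustrative value that would break a plain ' -> " replacement.
df = pd.DataFrame([['τ', "it's", 1], ['π', 'b', 2]])

# Let pandas build valid (ASCII-escaped) JSON, parse it, then dump it again
# without escaping non-ASCII characters.
parsed = json.loads(df.to_json())
js = json.dumps(parsed, ensure_ascii=False)
print(js)
```

The extra parse and dump costs some time on large frames, but quoting and escaping are left entirely to the json module.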

Update: as @Swier noted (thanks), strings in the original DataFrame that contain double quotes can be a problem. df.to_json() escapes them (for example, '"a"' becomes "\"a\"" in JSON). A small change to the string approach handles this case as well. Complete example:

import pandas as pd

def run_jsonifier(df):
    # convert index values to strings (JSON requires string keys)
    df.index = df.index.map(str)
    # convert column names to strings (JSON requires string keys)
    df.columns = df.columns.map(str)

    # convert the DataFrame to a dict and the dict to a string
    js = str(df.to_dict())
    # record the positions of the original double quotes for later escaping
    idx = [i for i, ch in enumerate(js) if ch == '"']
    # swap single quotes for double quotes
    js = js.replace("'", '"')
    # insert a backslash before each original double quote (JSON escape sequence);
    # every earlier insertion shifts later positions by one, hence the offset
    for offset, i in enumerate(idx):
        js = js[:i + offset] + '\\' + js[i + offset:]
    return js

# define double-quotes-rich dataframe
df = pd.DataFrame([['τ', '"a"', 1], ['π', 'this" breaks >>"<""< ', 2]])

# run our function to convert dataframe to json
print(run_jsonifier(df))
# run original `to_json()` to see difference
print(df.to_json())

Output:

{"0": {"0": "τ", "1": "π"}, "1": {"0": "\"a\"", "1": "this\" breaks >>\"<\"\"< "}, "2": {"0": 1, "1": 2}}
{"0":{"0":"\u03c4","1":"\u03c0"},"1":{"0":"\"a\"","1":"this\" breaks >>\"<\"\"< "},"2":{"0":1,"1":2}}
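
When no path is passed, to_json() returns the JSON text as a string, so no file encoding is involved and force_ascii=False can be applied directly. A minimal sketch on the same quote-heavy frame, checked with the standard json module:

```python
import json

import pandas as pd

df = pd.DataFrame([['τ', '"a"', 1], ['π', 'this" breaks >>"<""< ', 2]])

# With no path argument, to_json() returns the JSON text as a str.
js = df.to_json(force_ascii=False)
print(js)

# Parsing it back recovers the original values, embedded quotes included.
parsed = json.loads(js)
print(parsed['1']['0'])
```

Writing that string to disk still calls for a file opened with encoding='utf-8'.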

Source: https://stackoverflow.com/questions/39612240/writing-pandas-dataframe-to-json-in-unicode
