I am trying to create a JSON object and append it to a list, but without success. I get this error message:

Traceback (most recent call last):
  File "/projects/circos/test.py", line 32, in <module>
    read_relationship('data/chr03_small_n10.blast')
  File "/projects/circos/test.py", line 20, in read_relationship
    tmp = ("[source: {id: '{}',start: {},end: {}},target: {id: '{}',start: {}, end: {}}],").format(parts[0],parts[2],parts[3],parts[1],parts[4],parts[5])
KeyError: 'id'

with the following code:

def read_relationship(filename):
    data = []
    with open(filename) as f:
        f.next()
        for line in f:
            try:
                parts = line.rstrip().split('\t')
                query_name = parts[0]
                subject_name = parts[1]
                query_start = parts[2]
                query_end = parts[3]
                subject_start = parts[4]
                subject_end = parts[5]


                # I need: [source: {id: 'Locus_1', start: 1, end: 1054}, target: {id: 'tig00007234', start: 140511, end: 137383}],
                tmp = ("[source: {id: '{}',start: {},end: {}},target: {id: '{}',start: {}, end: {}}],").format(parts[0],parts[2],parts[3],parts[1],parts[4],parts[5])
                data.append(tmp)

            except ValueError:
                pass

    with open('data/data.txt', 'w') as outfile:
        json.dump(data, outfile)


read_relationship('data/chr03_small_n10.blast')

What did I miss?

    Is that supposed to be part of the JSON document? Because it'll be dumped as an opaque string value. Not that it could ever be seen as valid JSON. Commented Oct 28, 2017 at 21:03

2 Answers

5

You are using json.dump() the wrong way: it serializes Python objects for you, so there is no need to build the JSON text by hand.

You pass an object and a file object:

json.dump(object, fileobject)

Use a dict for the key-value mapping:

import json

def read_relationship(filename):
    data = []
    with open(filename) as f:
        next(f)  # skip the first line
        for line in f:
            try:
                parts = line.rstrip().split('\t')
                query_name = parts[0]
                subject_name = parts[1]
                query_start = parts[2]
                query_end = parts[3]
                subject_start = parts[4]
                subject_end = parts[5]

                # use dict here; source is the query, target is the subject
                item = {
                    'source': {
                        'id': query_name,
                        'start': query_start,
                        'end': query_end
                    },
                    'target': {
                        'id': subject_name,
                        'start': subject_start,
                        'end': subject_end
                    }
                }
                data.append(item)

            except ValueError:
                pass

    with open('data/data.txt', 'w') as outfile:
        json.dump(data, outfile)


read_relationship('data/chr03_small_n10.blast')
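
To see why the dict matters, here is a minimal sketch (Python 3; the sample values are hypothetical, borrowed from the comment in the question's code) that compares what json.dumps() does with a pre-formatted string and with a dict:

```python
import json

# Hypothetical sample values, borrowed from the comment in the question's code.
as_string = "[source: {id: 'Locus_1', start: 1, end: 1054}],"
as_dict = {'source': {'id': 'Locus_1', 'start': 1, 'end': 1054}}

# A pre-formatted string is serialized as one opaque JSON string value.
string_json = json.dumps([as_string])

# A dict is serialized as a real JSON object with nested keys.
dict_json = json.dumps([as_dict])

print(string_json)
print(dict_json)

# Reading both back shows the difference: a plain string versus nested data.
string_roundtrip = json.loads(string_json)
dict_roundtrip = json.loads(dict_json)
```

After the round trip, string_roundtrip[0] is still just text, while dict_roundtrip[0]['source']['end'] is the number 1054.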

4 Comments

Rather than getting [{"source": {", would it be possible to get [["source": {"? I need it for github.com/nicgirault/circosJS#chords.
You cannot put a key-value pair directly inside a list/array; that would be invalid JSON. Please stop formatting the JSON string yourself.
You're right. How can I write each list element on a new line?
Ask another question, such as "How to pretty print JSON string in file?" Please do not extend this question any further.
2

You need to double the { and } characters that are not placeholders; {id:...} is seen as a named placeholder otherwise:

tmp = (
    "[source: {{id: '{}',start: {},end: {}}},"
    "target: {{id: '{}',start: {}, end: {}}}],").format(
        parts[0], parts[2], parts[3], parts[1], parts[4], parts[5])

The {{ and }} sequences end up as single { and } characters in the result.
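
As a small illustration of that rule, the sketch below (shortened, hypothetical templates, not the full string from the question) reproduces the KeyError and then shows the doubled-brace version working:

```python
# Shortened, hypothetical templates; only the brace handling differs.
template_bad = "{id: '{}'}"       # {id: ...} is parsed as a named placeholder
template_good = "{{id: '{}'}}"    # {{ and }} are literal braces

try:
    template_bad.format('Locus_1')
    missing_key = None
except KeyError as exc:
    # str.format() looked for a keyword argument called 'id' and found none.
    missing_key = exc.args[0]

result = template_good.format('Locus_1')

print(missing_key)
print(result)
```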

Rather than passing each part in separately, you can use numbered slots and unpack the list:

tmp = (
    "[source: {{id: '{0}',start: {2},end: {3}}},"
    "target: {{id: '{1}',start: {4}, end: {5}}}],").format(
        *parts)
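
For example, with a hypothetical row in the question's column order (query name, subject name, query start, query end, subject start, subject end), the numbered template picks the fields out by index:

```python
# Hypothetical row; values borrowed from the comment in the question's code.
parts = ['Locus_1', 'tig00007234', '1', '1054', '140511', '137383']

# {0}..{5} index into the unpacked list, so the order in the template
# no longer has to match the order of the columns.
tmp = (
    "[source: {{id: '{0}',start: {2},end: {3}}},"
    "target: {{id: '{1}',start: {4}, end: {5}}}],").format(*parts)

print(tmp)
```

Note that this still only produces a string; unpacking with *parts also assumes every row has at least six columns.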

You should consider using the csv module to read your TSV data. If you meant for the above data to be part of the JSON document (not as an embedded string but as separate JSON arrays and objects), then formatting it as a string won't work.

Build Python lists and dicts instead and let json.dump() serialize them. You'll need to convert the numeric columns to integers first, though:

import csv
import json

def read_relationship(filename):
    data = []
    with open(filename, 'rb') as f:
        reader = csv.reader(f, delimiter='\t')
        next(reader, None)
        for row in reader:
            data.append([{
                'source': {
                    'id': row[0],
                    'start': int(row[2]),
                    'end': int(row[3]),
                },
                'target': {
                    'id': row[1],
                    'start': int(row[4]),
                    'end': int(row[5]),
                },
            }])

    with open('data/data.txt', 'w') as outfile:
        json.dump(data, outfile)
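
The function above opens the file in 'rb' mode, which is the Python 2 convention for the csv module; on Python 3 the file would be opened in text mode with newline='' instead. Here is a rough Python 3 sketch of the same idea that runs on an in-memory sample, so it can be tried without the BLAST file. The helper name rows_to_chords, the header names and the sample row are all hypothetical, and the outer list around each object is dropped here:

```python
import csv
import io
import json

# Hypothetical sample: a header line, then one tab-separated hit.
sample = (
    "query\tsubject\tq_start\tq_end\ts_start\ts_end\n"
    "Locus_1\ttig00007234\t1\t1054\t140511\t137383\n"
)

def rows_to_chords(lines):
    """Turn tab-separated rows into a list of source/target dicts."""
    reader = csv.reader(lines, delimiter='\t')
    next(reader, None)  # skip the header line, if there is one
    chords = []
    for row in reader:
        chords.append({
            'source': {'id': row[0], 'start': int(row[2]), 'end': int(row[3])},
            'target': {'id': row[1], 'start': int(row[4]), 'end': int(row[5])},
        })
    return chords

# io.StringIO stands in for open(filename, newline='') on a real file.
chords = rows_to_chords(io.StringIO(sample))
chords_json = json.dumps(chords)
print(chords_json)
```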

3 Comments

Why are you wrapping the object in a list? What would be the use of [{source:{}}]?
@ElisByberi because that's what the string the OP was building contains. It's trivially dropped if not needed. Unfortunately the OP has not given any expected output or documentation on what they tried to produce.
Yes, I saw. This question is off-topic anyway.
