I have a workflow where output from one process is input to the next.

Process A outputs JSON.

Process B's input needs to be JSON.

However, since I pass the JSON as a command-line argument, it becomes a string.

The command below is not under my control. It is autogenerated by Nextflow, so I need a solution that works with it as-is. The solution need not involve JSON, but I need to access these values, keeping in mind that the argument is essentially just a string.

python3.7 typing.py '{'id': '3283', 'code': '1234', 'task': '66128b3b-3440-4f71-9a6b-c788bc9f5d2c'}'

typing.py

import json
import sys


def download_this(task_as_string):
    print("Normal")
    print(task_as_string)

    first = json.dumps(task_as_string)
    print("after json.dumps")
    print(first)

    second = json.loads(first)
    print("after json.loads")
    print(second)
    print(type(second))

if __name__ == "__main__":
    download_this(sys.argv[1])

I thought calling json.dumps and then json.loads would make it work, but it does not.

Output

Normal
{id: 3283, code: 1234, task: 66128b3b-3440-4f71-9a6b-c788bc9f5d2c}
after json.dumps
"{id: 3283, code: 1234, task: 66128b3b-3440-4f71-9a6b-c788bc9f5d2c}"
after json.loads
{id: 3283, code: 1234, task: 66128b3b-3440-4f71-9a6b-c788bc9f5d2c}
<class 'str'>

And if I do print(second["task"]), I get a "string indices must be integers" error:

Traceback (most recent call last):
  File "typing.py", line 78, in <module>
    download_this(sys.argv[1])
  File "typing.py", line 55, in download_typed_hla
    print(second["task"])    
TypeError: string indices must be integers

So it was never converted to a dict in the first place. Any ideas how I can get around this problem?

  • Looks like you have things the wrong way round. If the item is a string, you need json.loads, not json.dumps. Commented Oct 18, 2019 at 14:29
  • Not quite. JSON needs strings to be in double quotes, and {id: 3283, code: 1234, task: 66128b3b-3440-4f71-9a6b-c788bc9f5d2c} has none, so it isn't recognised as a dict. Commented Oct 18, 2019 at 14:30
  • How about the thing discussed here: stackoverflow.com/q/34812821/4636715 Commented Oct 18, 2019 at 14:32
  • Wat? Your task_as_string is a string. You dump it to a string-in-a-string. You then load it again to be just-a-string. It's never a dict. You're not passing valid JSON in to start with, so you can't handle it as JSON. Dumping it to JSON doesn't improve that situation. Commented Oct 18, 2019 at 14:33
  • JSON is always a string. json.loads turns a string into a Python object (this includes turning a string like '"foo"' into the Python str object 'foo'). Commented Oct 18, 2019 at 14:38

1 Answer

A couple of things:

  1. Your JSON is not properly formatted. Keys and string values must be enclosed in double quotes.
  2. You are passing in a stringified version of the JSON, then stringifying it further before trying to load it. Just load it directly.

import json


def download_this(task_as_string):
    print("Normal")
    print(task_as_string)

    # The input is already a JSON string, so parse it directly.
    second = json.loads(task_as_string)
    print("after json.loads")
    print(second)
    print(type(second))

download_this('{"id": "3283", "code": "1234", "task": "66128b3b-3440-4f71-9a6b-c788bc9f5d2c"}')

Normal
{"id": "3283", "code": "1234", "task": "66128b3b-3440-4f71-9a6b-c788bc9f5d2c"}
after json.loads
{'id': '3283', 'code': '1234', 'task': '66128b3b-3440-4f71-9a6b-c788bc9f5d2c'}
<class 'dict'>
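
For comparison, here is a small sketch of why the dumps-then-loads round trip in the question can never produce a dict. The `raw` value is a shortened, illustrative stand-in for the argument as the script receives it:

```python
import json

# Shortened stand-in for sys.argv[1] as the script sees it (no quotes left).
raw = "{id: 3283, code: 1234}"

encoded = json.dumps(raw)      # wraps the str in a JSON string literal
decoded = json.loads(encoded)  # unwraps that literal: a str again, not a dict

print(encoded)
print(type(decoded).__name__)
print(decoded == raw)
```

json.dumps never inspects the contents of the string, so the round trip simply returns the original text.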

To get around your input problem, provided that you trust the input from Nextflow to conform to a simple dictionary-like structure, you could do something like this:

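import sys

# As in the question, the raw argument comes from the command line.
task_as_string = sys.argv[1]

d = dict()
# Strip the braces, then split into "key: value" groups on commas.
for group in task_as_string.replace('{', '').replace('}', '').split(','):
    # Split on the first colon only, so a value containing ':' stays intact.
    key, value = group.split(':', 1)
    d[key.strip()] = value.strip()

print(d)
print(type(d))

python3 typing.py '{'id': '3283', 'code': '1234', 'task': '66128b3b-3440-4f71-9a6b-c788bc9f5d2c'}'
{'id': '3283', 'code': '1234', 'task': '66128b3b-3440-4f71-9a6b-c788bc9f5d2c'}
<class 'dict'>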
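
The quotes are gone before Python ever runs: the shell treats each '...' pair as a quoted segment and joins the pieces into one argument. A rough way to see this is shlex.split, which approximates POSIX shell word splitting. This assumes the shell Nextflow uses handles single quotes the same way, and the command is shortened for illustration:

```python
import shlex

# Shortened version of the autogenerated command from the question.
command = "python3.7 typing.py '{'id': '3283', 'code': '1234'}'"

# shlex.split approximates how a POSIX shell builds the argument list.
argv = shlex.split(command)

# argv[2] is what the script would receive as sys.argv[1].
print(argv[2])
```

The spaces survive because they sit inside quoted segments, while the key and value text sits between them, unquoted.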

If the JSON coming from Nextflow is more complicated (i.e. with nesting and/or lists), then you'll have to come up with a more suitable parsing mechanism.
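
One possible middle ground, sketched below, is to try json.loads first and fall back to the flat parse only when that fails. This way nested data works whenever the upstream process does manage to deliver valid JSON. The name parse_task is hypothetical, and the fallback still assumes a flat mapping with no commas inside keys or values:

```python
import json


def parse_task(arg):
    """Return a dict from valid JSON, or from a flat quote-stripped mapping."""
    try:
        parsed = json.loads(arg)
        if isinstance(parsed, dict):
            return parsed
    except json.JSONDecodeError:
        pass  # not valid JSON, fall through to the naive parser

    # Fallback: assumes "{key: value, key: value}" with no nesting.
    result = {}
    body = arg.strip().strip("{}")
    for pair in body.split(","):
        if not pair.strip():
            continue
        key, sep, value = pair.partition(":")
        if not sep:
            raise ValueError(f"cannot parse key/value pair: {pair!r}")
        result[key.strip()] = value.strip()
    return result


print(parse_task('{"id": "3283", "tags": ["a", "b"]}'))
print(parse_task("{id: 3283, code: 1234}"))
```

All values from the fallback path are strings, so any numeric fields would need converting afterwards.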

2 Comments

Yes, the workflow tool I'm using, Nextflow, handles the input/output. Nextflow passes it the way I've shown, so the script receives it without quotes. I need to figure out a way to access the data from python3.7 typing.py '{'id': '3283', 'code': '1234', 'task': '66128b3b-3440-4f71-9a6b-c788bc9f5d2c'}'
@daudnadeem I updated the answer to parse the string into a dict.
