1

I understand that serialization of data means converting a data structure or object state to a form which can be stored in a file or buffer, can be transmitted, and can be reconstructed later (https://www.tutorialspoint.com/object_oriented_python/object_oriented_python_serialization.htm). Based on this definition, converting a numpy array to .npy format should be serialization of the numpy array data object. However, I could not find this assertion anywhere, when I looked up on the internet. Most of the related links were mentioning about how pickle format does serialization of data in python. My questions is - is converting numpy array to .npz format an example of serialization of a python data object. If not, what are the reasons?

2 Answers 2

1

Well, according to Wikipedia:

In computing, serialization (or serialisation) is the process of translating data structures or object state into a format that can be stored (for example, in a file or memory buffer) or transmitted (for example, across a network connection link) and reconstructed later (possibly in a different computer environment).

And according to Numpy Doc:

Binary serialization

NPY format A simple format for saving numpy arrays to disk with the full information about them.

The .npy format is the standard binary file format in NumPy for persisting a single arbitrary NumPy array on disk. The format stores all of the shape and dtype information necessary to reconstruct the array correctly even on another machine with a different architecture.

The .npz format is the standard format for persisting multiple NumPy arrays on disk. A .npz file is a zip file containing multiple .npy files, one for each array.

So, putting this definitions together you can come up with an answer to your question. Yes, is a way of serialization. Also the process of storing and reading is fast

Sign up to request clarification or add additional context in comments.

Comments

0

np.save(filename, arr) writes the array to a file. Since a file a linear structure it is a form of serialization. But often 'serialization' refers to creating a string that can be sent to a database or over some 'pipeline'. I think you can save to a string buffer, but it takes a bit of trickery.

But in Python most objects have a pickle method, which creates a string which can be written to a file. In that sense pickle is a 2 step process - serialize and then write to file. The pickle for a numpy array is actually a save compatible form. (conversely, np.save of a non-array object uses that object's pickle).

savez writes a zip archive, containing one npy file for each array. It may in addition be compressed. There are OS tools for transferring zip archives to other computers.

1 Comment

From your answer I understand that np.save() results in serialization or array, though reluctantly, and np.savez is also a serialization process as it also writes the numpy arrays to a file. Am I correct? I am still not sure about the statements after reading your answer.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.