I am going to build a "simple" RESTful web service in PHP. It will expose APIs (returning JSON) for some data I collect on my web server. The main data table will be read-only for public API methods, and will be written by singleton private methods at regular timed intervals. Users will be able to write some data to private tables.
I want to avoid, if possible, the complications of handling a database (not even SQLite), so I am planning to serialize my data to file(s) on disk and deserialize them into memory whenever the PHP script is called.
Loading the whole data set into memory for each PHP instance should not place too heavy a burden on the web server (I hope). The numbers are these: the main data table is planned to hold at most 100k records, each at most 1 KB in size, so the data will be at most 100 MB, with a typical size of 10 MB. There will never be more than 100 concurrent users. These numbers are by design; there is no possibility of growing beyond them.
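To make the serialize/deserialize plan concrete, here is a minimal sketch of how the table could be written and read. The function names `save_table()` and `load_table()` are hypothetical, and the approach (write to a temporary file, then `rename()` over the target) is one assumption about how to keep readers from seeing a half-written file while the timed writer runs; it is not the only option.

```php
<?php
// Hypothetical helpers: the names and file layout are illustrative only.

/** Write the table by dumping to a temp file, then renaming it over the target. */
function save_table(string $path, array $table): void
{
    $tmp = $path . '.' . uniqid('', true) . '.tmp';
    if (file_put_contents($tmp, serialize($table), LOCK_EX) === false) {
        throw new RuntimeException("Cannot write $tmp");
    }
    // Assumption: temp file and target are on the same filesystem, so the
    // rename replaces the target in one step and readers see old or new data.
    if (!rename($tmp, $path)) {
        unlink($tmp);
        throw new RuntimeException("Cannot replace $path");
    }
}

/** Load the table, or return an empty array if the file does not exist yet. */
function load_table(string $path): array
{
    if (!is_file($path)) {
        return [];
    }
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }
    // Refuse to instantiate objects from the file; only plain arrays are expected.
    $table = unserialize($raw, ['allowed_classes' => false]);
    return is_array($table) ? $table : [];
}
```

With up to 100 concurrent requests each unserializing the whole file, the worst case is roughly 100 copies of the table in memory at once, so the 100 MB upper bound is the figure to check against the server's RAM and PHP's `memory_limit`.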
The question is: can I use a PHP associative array to perform queries on multiple keys?
An example: this is my simplified main data structure:
<?php
$data = [
    "1" => [
        "name" => "Alice",
        "zip" => "12345",
        "many" => "A",
        "other" => "B",
        "fields" => "C",
    ],
    "2" => [
        "name" => "Bob",
        "zip" => "67890",
        "many" => "X",
        "other" => "Y",
        "fields" => "Z",
    ],
    // ...
];
?>
To access a record by primary key, of course, I would do:
$key = "1";
$record = $data[$key];
But what if I want to access one or more records efficiently (i.e. without a sequential scan) by a different key, say "zip"? Of course, such keys could contain duplicate values. The only solution I have come up with is to build a new array for each secondary key I want to "index", and serialize it alongside the main data table.
For example:
$zip_idx = [
    "12345" => [ "1", "355", "99999", ],
    "67890" => [ "2", "732", ],
    // ...
];
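and then:
$zip = "67890";
$keys = $zip_idx[$zip] ?? []; // primary keys of the matching records
$records = array_intersect_key($data, array_flip($keys));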
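The secondary-index idea above can be sketched as follows. The helpers `build_index()` and `find_by()` are hypothetical names, not part of PHP, and the sample records (including record "3") are made up for illustration. The sketch assumes every indexed field holds a string or integer, since only those can be array keys. It builds one index per field and answers a query on several fields by intersecting the lists of primary keys.

```php
<?php
// Hypothetical helpers: build_index() and find_by() are illustrative names.

/** Build a secondary index: field value => list of primary keys. */
function build_index(array $data, string $field): array
{
    $index = [];
    foreach ($data as $pk => $record) {
        if (isset($record[$field])) {
            $index[$record[$field]][] = $pk;
        }
    }
    return $index;
}

/**
 * Return the records matching every field => value pair in $criteria,
 * using one prebuilt index per field and intersecting the key lists.
 */
function find_by(array $data, array $indexes, array $criteria): array
{
    $keys = null;
    foreach ($criteria as $field => $value) {
        $matches = $indexes[$field][$value] ?? [];
        $keys = $keys === null ? $matches : array_intersect($keys, $matches);
        if ($keys === []) {
            return []; // no record can satisfy the remaining criteria
        }
    }
    // Map the surviving primary keys back to full records.
    return array_intersect_key($data, array_flip($keys ?? []));
}

// Made-up sample data for illustration.
$data = [
    "1" => ["name" => "Alice", "zip" => "12345"],
    "2" => ["name" => "Bob",   "zip" => "67890"],
    "3" => ["name" => "Alice", "zip" => "67890"],
];
$indexes = [
    "zip"  => build_index($data, "zip"),
    "name" => build_index($data, "name"),
];
$found = find_by($data, $indexes, ["zip" => "67890", "name" => "Alice"]);
```

One detail to keep in mind: PHP converts array keys that look like decimal integers ("1", "67890") to integers, while a value with a leading zero such as "01234" stays a string. Lookups still work either way, but the primary keys stored in the index come back as integers. The indexes also have to be rebuilt (or updated) whenever the main table is rewritten, otherwise they go stale.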
So:
Do you see any issues, inconsistencies or lack of flexibility with this design?
Can you propose any smarter or more compact solution?
Do you have any other considerations or objections?