
I am going to build a "simple" RESTful web service with PHP. I will provide APIs to access some data (via JSON) I collect on my web server. The main data table will be read-only for public API methods and will be written by private singleton methods at regular timed intervals. Users will be able to write some data to private tables.

I want to avoid, if possible, the complications of handling a database (not even SQLite); so I am planning to serialize my data to file(s) on disk and deserialize them into memory whenever the PHP script is called.

Loading the whole data set into memory for each PHP instance should not place too heavy a burden on the web server (I hope). The numbers are these: the main data table is planned for a maximum of 100k records, each with a maximum record size of 1 KB, so the data will have a maximum possible size of 100 MB, with a usual size of about 10 MB; the maximum number of concurrent users will never be higher than 100. These numbers are by design; there is no possibility of growing bigger.

The question is: can I use a PHP associative array to perform queries on multiple keys?

An example: this is my simplified main data structure:

<?php
    $data = [
        "1" => [
            "name" => "Alice",
            "zip" => "12345",
            "many" => "A",
            "other" => "B",
            "fields" => "C",
        ],
        "2" => [
            "name" => "Bob",
            "zip" => "67890",
            "many" => "X",
            "other" => "Y",
            "fields" => "Z",
        ],
        // ...
    ];
?>

To access a record by primary key, of course, I should do:

$key = "12345";
$record = $data[$key];

But what if I want to access one or more records by a different key, say "zip", efficiently, i.e. avoiding a sequential scan? Of course these keys could contain duplicate values. The only solution I came up with is to build a new array for each secondary key to "index", and serialize it alongside the main data table...

For example:

$zip_idx = [
    "12345" => [ "1", "355", "99999", ],
    "67890" => [ "2", "732", ],
    // ...
];

and then:

$zip = "67890";
$records = $zip_idx[$zip];
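
Such an index would not need to be maintained by hand: it could be rebuilt from the main table in a single pass just before serializing. A minimal sketch (the helper name build_index is purely illustrative, not settled code):

// Hypothetical helper: derive a secondary index from the main
// table in one pass; duplicate field values simply group their
// record IDs under the same index key.
function build_index(array $data, string $field): array
{
    $idx = [];
    foreach ($data as $id => $record) {
        $idx[$record[$field]][] = $id;
    }
    return $idx;
}

$zip_idx = build_index($data, "zip");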

So:
Do you see any issues, inconsistencies, or lack of flexibility in this design?
Can you propose a smarter or more compact solution?
Do you have any considerations or objections?

1 Answer

I would not create any further arrays for other "indexes".

Just make a nice class for handling queries. A query by zip could look like this:

class Data
{
    protected $data;

    // Return every record whose "zip" field matches $zip.
    // array_filter() preserves the original keys (the record IDs).
    public function getByZip($zip)
    {
        return array_filter($this->getData(), function ($item) use ($zip) {
            return $item['zip'] == $zip;
        });
    }

    public function setData($data)
    {
        $this->data = $data;
    }

    public function getData()
    {
        return $this->data;
    }
}

$dataArray = [
    "1" => [
        "name" => "Alice",
        "zip" => "12345",
        "many" => "A",
        "other" => "B",
        "fields" => "C",
    ],
    "2" => [
        "name" => "Bob",
        "zip" => "67890",
        "many" => "X",
        "other" => "Y",
        "fields" => "Z",
    ],
    // ...
];

$data = new Data();

$data->setData($dataArray);

$result = $data->getByZip("12345");

You can also store the user ID in the array and query for it the same way; a generic variant is sketched below.
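
A minimal sketch of that generalization, assuming every record keeps the field as a plain top-level key (the names GenericData and getBy are illustrative, not part of the code above):

// Hypothetical generalization of getByZip(): the same sequential
// array_filter() scan, parameterized on the field name.
class GenericData
{
    protected $data = [];

    public function setData($data)
    {
        $this->data = $data;
    }

    public function getBy($field, $value)
    {
        return array_filter($this->data, function ($item) use ($field, $value) {
            return isset($item[$field]) && $item[$field] == $value;
        });
    }
}

$generic = new GenericData();
$generic->setData($dataArray);
$byName = $generic->getBy('name', 'Bob');   // every record whose name is "Bob"
$byZip  = $generic->getBy('zip', '12345');  // same result as getByZip("12345")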

greetings

Edit: regarding your performance question: normally you would use a database for data that can grow to 100 MB. The reason is that, with your array-on-file approach, the whole 100 MB file has to be read into memory. That is not a problem in itself, but most providers set a memory limit of around 128 MB for your application, and that could lead to problems.
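
For illustration, a minimal load/save round trip with serialize()/unserialize() could look like this ("data.ser" and the helper names are placeholders of mine); the unserialize() call is exactly where the whole data set is rebuilt in memory, so it is the step the memory limit applies to:

// Hypothetical persistence helpers; "data.ser" is a placeholder path.
function save_data(array $data, string $file): void
{
    // Serialize the whole array to disk in one shot, with a write lock.
    file_put_contents($file, serialize($data), LOCK_EX);
}

function load_data(string $file): array
{
    // Reads and rebuilds the entire data set in memory here.
    return unserialize(file_get_contents($file));
}

save_data($dataArray, "data.ser");
$dataArray = load_data("data.ser");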


6 Comments

Thanks... But, from php array_filter's docs (php.net/manual/en/function.array-filter.php): "Iterates over each value in the array"... :-( I'm afraid it will not be so efficient... :-)
It is very efficient: PHP array iteration is fast, much faster than a round trip to a database for data of this size. If you want to test performance, just echo microtime() at the start and end of the iteration to check how long it takes (a sketch follows after these comments); the time needed to read the file is what will be the bottleneck.
Hmmm... I will give your solution a try as soon as possible... :-)
Also, for efficiency you should use PHP's serialize() to write the array to your file.
There is a small error in your code: you close the array_filter( parenthesis before the {, but it should be closed after the }... By the way, it's very efficient; I will implement a solution based on array_filter. Thanks!
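
A minimal sketch of the microtime() check suggested in the comments above (the timing variables are illustrative):

// Time the array_filter() scan with wall-clock timestamps.
$t0 = microtime(true);
$result = $data->getByZip("12345");
$t1 = microtime(true);
echo "filter took " . round(($t1 - $t0) * 1000, 3) . " ms, " . count($result) . " match(es)\n";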