How to debug PHP addDocument() failure?

Rather than using indexer --print-rt, I tried to write a PHP script that would populate the initial RT index with data from a MySQL database.

I presume the way to do this is to use a mysqli_query() statement to fetch rows from the database, then use a while loop to addDocument the fetched MySQL database rows one by one, like this:

if (mysqli_num_rows($sql_result) > 0) {
  while ($current_row = mysqli_fetch_row($sql_result)) {
    list($id, $field2, $field3) = $current_row;
    $indx->addDocument([
      'id' => $id,
      'field2' => $field2,
      'field3' => $field3
    ], $id);
    unset($id, $field2, $field3);
  }
}

Unfortunately, the script does not complete and my web server error log shows the following error:

PHP Fatal error:  Uncaught Manticoresearch\\Exceptions\\ResponseException:
"unsupported value type" in /home/user/public_html/vendor/manticoresoftware/manticoresearch-php/src/Manticoresearch/Transport/Http.php:127\nStack trace:\n#0
/home/user/public_html/vendor/manticoresoftware/manticoresearch-php/src/Manticoresearch/Client.php(357):
Manticoresearch\\Transport\\Http->execute()\n#1
/home/user/public_html/vendor/manticoresoftware/manticoresearch-php/src/Manticoresearch/Client.php(189):
Manticoresearch\\Client->request()\n#2
/home/user/public_html/vendor/manticoresoftware/manticoresearch-php/src/Manticoresearch/Index.php(87):
Manticoresearch\\Client->insert()\n#3
/home/user/public_html/populatemanticorertindex.php(92):
Manticoresearch\\Index->addDocument()\n#4 {main}\n
thrown in /home/user/public_html/vendor/manticoresoftware/manticoresearch-php/src/Manticoresearch/Transport/Http.php on line 127

where line 92 of populatemanticorertindex.php is the first line of the array assignment inside the addDocument() call.

I can’t tell exactly what causes the script to crash. I’m not sure how to determine which iteration of the while loop (i.e., which database table row) is the problematic one. What code should I be putting in as my error handling to help me debug?

Also if my script is not the right way to go about initially populating a RT index, please let me know the better way to do it.

$indx->addDocument([
    'field2' => $field2,
    'field3' => $field3
  ], $id);
1 Like

It turned out that the first row that tried to put a NULL value into a field defined (using create()) as ['type' => 'text'] or ['type' => 'int'] would crash the whole thing.

I debugged it by putting a test conditional with a ++ incremented counter in the while loop that stops the loop when the counter goes above a chosen value. I thus implemented my own manual binary search to home in on the last working counter value, and printed out (using PHP’s var_export()) the next (i.e., the first non-working) $current_row. I could see that one of the fields in the first non-working $current_row was NULL.

So doing something like this seems to work better:

if (mysqli_num_rows($sql_result) > 0) {
  while ($current_row = mysqli_fetch_assoc($sql_result)) {
    $indx->addDocument([
      'field2' => ($current_row['field2'] ?? ''),
      'field3' => ($current_row['field3'] ?? 0)
    ], $id);
  }
}

(presuming that field2 is ['type' => 'text'] and field3 is ['type' => 'int']).

I’m not sure why this breaks in the Manticore PHP client but indexing the same data doesn’t break the command line indexer. It seems to me that the Manticore PHP client should be able to handle these situations smoothly. That is, the Manticore PHP client should be able to permit for NULL database values without causing a fuss.

It may also be worth mentioning that I had to override the default maximum memory and time limit values in my script:

ini_set('memory_limit', '1G');
set_time_limit(0);

One issue here is that the ID that is stored in the RT table isn’t the value that I put in. It’s a 19-digit number instead. For example, if $id is 3391, it ends up being stored as 8793066877211036762. How can I make sure that I can later recover the same ID that I try to put into the table?

EDIT: Nevermind. For some reason, casting the ID to an int before putting it in the table did the trick:

$indx->addDocument([
    'field2' => $field2,
    'field3' => $field3
  ], (int)$id);

I don’t know why the casting is necessary for the BIGINT-type IDs but not the INT-type attributes.

I don’t know why the casting is necessary for the BIGINT-type IDs but not the INT-type attributes.

Please provide a full script which reproduces this as I can’t reproduce it on my end.

Here’s mine for example:

➜  ~ cat test_robert.php
<?php
require_once __DIR__ . '/vendor/autoload.php';

$config = ['host'=>'127.0.0.1','port'=>9308];
$client = new \Manticoresearch\Client($config);
$index = $client->index('test');
$index->drop(true);
$index->create([
    'field2'=>['type'=>'text'],
    'field3'=>['type'=>'text']
    ]);

$id = 3391;
$index->addDocument([
  'field2' => 'abc',
  'field3' => 'abc'
], $id);

➜  ~ php test_robert.php

➜  ~ mysql -P9306 -h0 -e "select * from movies"
+------+--------+--------+
| id   | field2 | field3 |
+------+--------+--------+
| 3391 | abc    | abc    |
+------+--------+--------+

What if instead of declaring the $id in PHP you retrieve it from a MySQL database? Even if it is an unsigned integer in MySQL, will it be retrieved as text instead of an integer, and could that be part of the problem?

Curiously, I noticed this discussion in which smazur reported having to cast the $id.

will it be retrieved as text instead of an integer, and could that be part of the problem?

$id as a string doens’t change anything. The script still works fine.

➜  ~ cat test_robert.php
<?php
require_once __DIR__ . '/vendor/autoload.php';

$config = ['host'=>'127.0.0.1','port'=>9308];
$client = new \Manticoresearch\Client($config);
$index = $client->index('test');
$index->drop(true);
$index->create([
    'field2'=>['type'=>'text'],
    'field3'=>['type'=>'text']
    ]);

$id = '3391';
$index->addDocument([
  'field2' => 'abc',
  'field3' => 'abc'
], $id);
➜  ~
➜  ~ php test_robert.php
➜  ~
➜  ~ mysql -P9306 -h0 -e "select * from movies"
+------+--------+--------+
| id   | field2 | field3 |
+------+--------+--------+
| 3391 | abc    | abc    |
+------+--------+--------+

Sorry, I can’t explain it. My script looks like yours, except I connect to a MySQL database with mysqli, set the charset as 'utf8', query the database to snag my data (including the unsigned integer ID number), go through the snagged data line by line using a mysqli_fetch_assoc($result), in a while loop, and try to put the data and the ID into the RT index using addDocument.

Possibly this was a bug that was fixed in a newer version of the PHP client (given that it’s in the changelog of 3.0.0)?

Possibly this was a bug that was fixed in a newer version of the PHP client

Might be. Can you check if it’s the case by trying the newest version?

<?php
require_once('../backend.php');
require_once('../utilities.php');
require_once __DIR__.'/../../vendor/autoload.php';
use Manticoresearch\Client;
use Manticoresearch\Index;
$config = ['host'=>'127.0.0.1','port'=>9308];
$client = new \Manticoresearch\Client($config);
$index = $client->index('test');
$index->drop(true);
$index->create([
    'field2'=>['type'=>'text'],
    'field3'=>['type'=>'text']
    ]);

@ $db = mysqli_connect(MYSQL_HOST, MYSQL_USER, MYSQL_PASSWORD, MYSQL_DATABASE);
mysqli_set_charset($db, 'utf8');

$query = "SELECT id, field2, field3 FROM mytable LIMIT 1";

$result = mysqli_query($db, $query);
$row = mysqli_fetch_assoc($result);
$index->addDocument([
  'field2' => $row['field2'],
  'field3' => $row['field3'],
], $row['id']);
var_dump($row['id']);

?>

Output of the var_dump: string(3) "414"

Output of the command mysql -P9306 -h0 -e "select * from test":

+---------------------+------------+------------+
| id                  | field2     | field3     |
+---------------------+------------+------------+
| 8793076944020176898 | field2data | field3data |
+---------------------+------------+------------+

Last time I checked, 414 does not equal 8793076944020176898.

In the underlying MySQL database, id is stored as INT(10) UNSIGNED.

BTW I am using the newest versions of Manticore and the PHP client.

If you run this:

➜  ~ cat test_robert.php
<?php
require_once __DIR__ . '/vendor/autoload.php';

$config = ['host'=>'127.0.0.1','port'=>9308];
$client = new \Manticoresearch\Client($config);
$index = $client->index('test');
$index->drop(true);
$index->create([
    'field2'=>['type'=>'text'],
    'field3'=>['type'=>'text']
    ]);

$id = '3391';
$index->addDocument([
  'field2' => 'abc',
  'field3' => 'abc'
], $id);
➜  ~
➜  ~ php test_robert.php
➜  ~
➜  ~ mysql -P9306 -h0 -e "select * from movies"
+------+--------+--------+
| id   | field2 | field3 |
+------+--------+--------+
| 3391 | abc    | abc    |
+------+--------+--------+

what do you get?

Replacing select * from movies with select * from test, I get:

+---------------------+--------+--------+
| id                  | field2 | field3 |
+---------------------+--------+--------+
| 8793076944020176899 | abc    | abc    |
+---------------------+--------+--------+

OK, great! What’s the exact version of Manticore and the php client you are using?

Also, what’s your PHP version and your operating system?

Ubuntu 22.04.2 LTS
MariaDB 15.1 Distrib 10.6.12-MariaDB
PHP 8.1.5
Manticore 6.0.4
Manticore PHP Client: 3.0.0

I’ve reproduced the issue. I had a bug in the above example:

$index = $client->index('test');
...
mysql -P9306 -h0 -e "select * from movies"

(different tables).

I’ve created an issue about it string id turns into incorrect int · Issue #115 · manticoresoftware/manticoresearch-php · GitHub .
Thanks for pointing this out!

1 Like