I'm looking for a drop-in include script / class that dissects multipart/form-data and fills up $_POST(+raw) and $_FILES from it. Usually PHP does that itself. But because the automatic handling is insufficient for me and makes php://input inaccessible[1], I'll probably be using something like this to prevent that:

RewriteRule .* - [E=CONTENT_TYPE:noparsing/for-you-php]
Does not work. Actual solution requires mod_headers and RequestHeader set...

The extracting procedure might not be that complex. But I'd rather use a well-tested solution. And foremost I would prefer an implementation that uses fgets for splitting, and mimics the $_FILES handling closely and efficiently. Finding the end of binary payloads would seem rather tricky to me, in particular when you have to strip off \r\n but might encounter clients that only send \n (not allowed, but possible).
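
To illustrate the end-of-payload problem, here is a minimal sketch of an fgets-based copy loop. It holds back the most recent line so that only the line break directly in front of the boundary is stripped, whether the client sent \r\n or a bare \n. The function name copy_part_body and the in-memory streams are assumptions made for the example, not part of any existing library.

```php
<?php
// Hypothetical helper: copy one part's body from $in to $out, stopping at the
// first line that starts with $delimiter ("--" followed by the boundary).
function copy_part_body($in, $out, string $delimiter): bool
{
    $held = null;                          // last line read, not yet written

    while (($line = fgets($in)) !== false) {
        if (strpos($line, $delimiter) === 0) {
            if ($held !== null) {
                // The line break before the delimiter is framing, not data.
                fwrite($out, preg_replace('~\r?\n\z~', '', $held));
            }
            return true;                   // delimiter found
        }
        if ($held !== null) {
            fwrite($out, $held);           // inner line breaks are data
        }
        $held = $line;
    }

    if ($held !== null) {
        fwrite($out, $held);               // truncated input: keep what we have
    }
    return false;                          // no delimiter before end of stream
}

// Usage with in-memory streams standing in for php://input and a temp file.
$in = fopen('php://temp', 'w+b');
fwrite($in, "line one\r\nline two\n\r\n--XYZ\r\nnext part");
rewind($in);

$out = fopen('php://temp', 'w+b');
$found = copy_part_body($in, $out, '--XYZ');

rewind($out);
$body = stream_get_contents($out);         // payload without the framing line break
```

One case stays ambiguous: if a client frames with a bare \n and the payload itself ends in \r, this sketch strips that \r as well.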

I'm certain something like this exists. But I'm having a hard time googling it. Does anyone know an implementation? (PEAR::mimeDecode can be hacked to get sort of working for form-data, but is a memory hog.)

The use case in short: I need to preserve the raw field names (including whitespace and special characters) for logging, but can't always avoid file uploads.
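
As a rough sketch of what preserving the raw names could involve, the snippet below pulls the parameters out of a Content-Disposition value without trimming or replacing anything inside the quotes. The helper name parse_disposition_params is made up, and it assumes the client escapes quotes and backslashes with a backslash. Clients that encode them differently would need other handling.

```php
<?php
// Hypothetical helper: return the parameters of a Content-Disposition header
// value as an array. Quoted values are kept as sent, apart from undoing the
// backslash escapes.
function parse_disposition_params(string $header): array
{
    $params = [];
    // ;key="quoted value with \" escapes"   or   ;key=token
    $pattern = '~;\s*([a-z*_-]+)=(?:"((?:[^"\\\\]|\\\\.)*)"|([^;\s]*))~i';

    preg_match_all($pattern, $header, $matches, PREG_SET_ORDER);

    foreach ($matches as $m) {
        // Group 3 is only present when the value was not quoted.
        $value = isset($m[3]) ? $m[3] : preg_replace('~\\\\(.)~s', '$1', $m[2]);
        $params[strtolower($m[1])] = $value;
    }

    return $params;
}

// The leading space and the embedded ";inject=1" stay part of the name.
$field = parse_disposition_params('form-data; name=" text field \\\\ 1 \\";inject=1"');

$file = parse_disposition_params('form-data; name="file"; filename="dial.png"');
```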


For illustration, this is what a POST request looks like:

POST / HTTP/1.1
Host: localhost:8000
Content-Length: 17717
Content-Type: multipart/form-data; boundary=----------3wCuBwquE9P7A4OEylndVx

And after a \r\n\r\n sequence the multipart/ payload follows like this:

------------3wCuBwquE9P7A4OEylndVx
Content-Disposition: form-data; name="_charset_"

windows-1252
------------3wCuBwquE9P7A4OEylndVx
Content-Disposition: form-data; name=" text field \\ 1 \";inject=1"

text1 te twj sakfkl
------------3wCuBwquE9P7A4OEylndVx
Content-Disposition: form-data; name="file"; filename="dial.png"
Content-Type: image/png

IPNG Z @@@MIHDR@@B`@@B;HF@@@-'.e@@@AsRGB@.N\i@@@FbKGD@?@?@? ='S@@@     
@@@GtIMEGYAAU,#}BRU@@@YtEXtComment@Created with GIMPWANW@@ @IDATxZl]w|
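
Note that each delimiter line in the payload is the boundary parameter from the Content-Type header with two extra dashes in front. A small sketch of that relationship follows. The function name multipart_delimiter and the sample boundary abc123 are made up for the example.

```php
<?php
// Hypothetical helper: turn a Content-Type header value into the delimiter
// that starts every part ("--" followed by the boundary parameter).
function multipart_delimiter(string $contentType): ?string
{
    $pattern = '~^multipart/form-data\b.*?;\s*boundary=(?:"([^"]+)"|([^;\s]+))~i';

    if (preg_match($pattern, $contentType, $m) !== 1) {
        return null;                       // not multipart/form-data, or no boundary
    }

    // $m[2] is only set for the unquoted form; $m[1] holds the quoted form.
    return '--' . ($m[2] ?? $m[1]);
}

$delimiter = multipart_delimiter('multipart/form-data; boundary=abc123');

// Each part starts with a line equal to $delimiter;
// the whole body ends with $delimiter followed by "--".
$isClosing = rtrim("--abc123--\r\n", "\r\n") === $delimiter . '--';
```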

Comments

This will probably become a bounty question..

Written by mario

Will it be guaranteed that each MIME part in the multipart will have a Content-Length? I don't remember if the spec requires this or not. I'd imagine it would.

Written by Charles

The bummer is that the spec RFC2388 does not mention Content-Length at all. While I would assume most current browsers do so (and use base64 encoding at least), I'm actually trying to support the more wacky clients. (Edit: No, not even Opera does it.)

Written by mario

Doesn't the "$argc" variable contain the raw post data? Or the empty string $_POST['']? Why do you say that automatic handling is insufficient? How does php://stdio work for you?

Written by David Freitas

@David: Neither argc nor argv are present for POST requests. And php://stdin and php://input are empty. PHP soaks up the complete POST body in main/rfc1867.c. That's why it is inaccessible. My issue is that leading spaces are stripped and many ASCII characters converted into _ underscores.

Written by mario
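
The name mangling mentioned in the last comment can be reproduced without an upload. To my knowledge parse_str() registers names the same way the POST handling does, so it makes for a quick demonstration. The field names are made up for the example.

```php
<?php
// Dots and spaces in incoming variable names are replaced by underscores.
parse_str('first.name=a&last+name=b', $mangled);

// The original spellings "first.name" and "last name" are gone at this point.
$keys = array_keys($mangled);
```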

Accepted Answer

It's late and I can't test this at the moment, but the following should do what you want:

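// php://input holds only the request body, so read the boundary from the
// Content-Type header if it is still available there; failing that, the loop
// below falls back to scanning the stream for a "boundary=" line.
$boundary = null;
$POST = array();
$FILES = array();

if ((isset($_SERVER['CONTENT_TYPE']) === true) && (preg_match('~boundary="?([^";\s]+)~i', $_SERVER['CONTENT_TYPE'], $match) === 1))
{
    $boundary = '--' . $match[1];
}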

if (is_resource($input = fopen('php://input', 'rb')) === true)
{

    while ((feof($input) !== true) && (($line = fgets($input)) !== false))
    {
        if (isset($boundary) === true)
        {
            $content = null;

            while ((feof($input) !== true) && (($line = fgets($input)) !== false))
            {
                $line = trim($line);

                if (strlen($line) > 0)
                {
                    $content .= $line . ' ';
                }

                else if (empty($line) === true)
                {
                    if (stripos($content, 'name=') !== false)
                    {
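                        // take name= from the most recent Content-Disposition header (not filename=);
                        // no trim(), so whitespace inside the quotes stays part of the raw name
                        $name = stripcslashes(preg_replace('~.*Content-Disposition:.*?(?<![a-z])name=(?:"((?:[^"\\\\]|\\\\.)*)"|([^;\s]*)).*~is', '$1$2', $content));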

                        if (stripos($content, 'Content-Type:') !== false)
                        {
                            $tmpname = tempnam(sys_get_temp_dir(), '');

                            if (is_resource($temp = fopen($tmpname, 'wb')) === true)
                            {
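                                $previous = null;

                                while ((feof($input) !== true) && (($line = fgets($input)) !== false) && (strpos($line, $boundary) !== 0))
                                {
                                    // line breaks inside the payload are data, so write held-back lines unchanged
                                    if ($previous !== null)
                                    {
                                        fwrite($temp, $previous);
                                    }

                                    $previous = $line;
                                }

                                // only the line break right before the boundary belongs to the framing
                                if ($previous !== null)
                                {
                                    fwrite($temp, preg_replace('~\r?\n\z~', '', $previous));
                                }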

                                fclose($temp);
                            }

                            $FILES[$name] = array
                            (
                                'name' => stripcslashes(preg_replace('~.*Content-Disposition:.*?filename=(?:"((?:[^"\\\\]|\\\\.)*)"|([^;\s]*)).*~is', '$1$2', $content)),
                                'type' => trim(preg_replace('~.*Content-Type: ([^\s]*).*~i', '$1', $content)),
                                'size' => sprintf('%u', filesize($tmpname)),
                                'tmp_name' => $tmpname,
                                'error' => UPLOAD_ERR_OK,
                            );
                        }

                        else
                        {
                            $result = null;

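                            while ((feof($input) !== true) && (($line = fgets($input)) !== false) && (strpos($line, $boundary) !== 0))
                            {
                                // keep inner line breaks: they belong to the value
                                $result .= $line;
                            }

                            // only the line break in front of the boundary is framing
                            $result = preg_replace('~\r?\n\z~', '', (string) $result);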

                            if (array_key_exists($name, $POST) === true)
                            {
                                if (is_array($POST[$name]) === true)
                                {
                                    $POST[$name][] = $result;
                                }

                                else
                                {
                                    $POST[$name] = array($POST[$name], $result);
                                }
                            }

                            else
                            {
                                $POST[$name] = $result;
                            }
                        }
                    }

                    // this part is done: collect the next part's headers from scratch
                    $content = null;
                }
            }
        }

        else if ((is_null($boundary) === true) && (strpos($line, 'boundary=') !== false))
        {
            $boundary = "--" . trim(preg_replace('~.*boundary="?([^";\s]+)"?.*~i', '$1', $line));
        }
    }

    fclose($input);
}

echo '<pre>';
print_r($POST);
echo '</pre>';

echo '<hr />';

echo '<pre>';
print_r($FILES);
echo '</pre>';
Written by Alix Axel
The content is written by members of the stackoverflow.com community.
It is licensed under cc-wiki