Ever wondered how those email-based systems actually work behind the scenes? You know, the ones where you send an email to a specific address and – like magic – it shows up on a website or triggers some action? I’ve had experience building such systems, and today I’m going to pull back the curtain and show you exactly how to parse email with PHP.
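In this guide, I’ll walk you through setting up email piping, reading the raw message, parsing headers and MIME parts, extracting attachments, and handling encodings and other common problems.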
Let’s dive right in!
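Email piping is the backbone of email processing. It’s the mechanism that forwards incoming emails to your PHP script for processing. Here’s the basic flow: the mail server receives a message for a designated address, hands the raw message to your script on standard input, and the script parses it and acts on the result.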
This process enables you to build powerful applications that react to email content – forums, ticketing systems, mailing lists, and much more.
While this part doesn’t require coding, it’s crucial to get right. Most hosting providers offer this feature through cPanel or similar control panels.
This is where many beginners make mistakes. When entering the path to your script, use absolute paths for both the PHP binary and the script itself.
A proper command looks like this:
/usr/local/bin/php -q /home/yourdomain/public_html/handlers/email_parser.php
The -q flag suppresses HTTP headers, which is essential for proper processing.
Not all hosting plans support email piping. If you don’t see this option, contact your hosting provider. Sometimes it’s available on higher-tier plans or can be enabled upon request.
I’ve personally set this up on many hosts including HostGator, BlueHost, SiteGround, and A2Hosting – they all support it, but sometimes with minor differences in configuration.
Now that emails are being forwarded to your script, you need to capture their content. This requires a special directive at the very start of your file – before any PHP tags:
#!/usr/local/bin/php -q
<?php
// Your code starts here
This “shebang” line tells the server to use the PHP interpreter. The path must match what you specified in the email forwarder.
Next, we need to read the raw email from standard input. Here’s the code:
$fd = fopen("php://stdin", "r");
$email_content = "";
while (!feof($fd)) {
    $email_content .= fread($fd, 1024);
}
fclose($fd);
This code opens PHP’s standard input stream, reads everything available, and stores it in the $email_content variable. Simple but effective!
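The read loop above can also be written with stream_get_contents(), which makes it easy to cap how much input the script will accept. The following is only a sketch: the helper name read_raw_email and the 10 MB default limit are assumptions, not part of the original script.

```php
<?php
// Hypothetical helper: read a raw message from any readable stream,
// refusing input larger than $maxBytes (the 10 MB default is an assumption).
function read_raw_email($stream, int $maxBytes = 10485760): string
{
    // Ask for one byte more than the cap so oversized input is detectable.
    $raw = stream_get_contents($stream, $maxBytes + 1);
    if ($raw === false) {
        throw new RuntimeException('Could not read the message stream');
    }
    if (strlen($raw) > $maxBytes) {
        throw new RuntimeException('Message exceeds the size limit');
    }
    return $raw;
}
```

In a piped script you would call it as `$email_content = read_raw_email(STDIN);`. Taking the stream as a parameter also means the same function can be exercised against a saved sample file.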
At this point, you have the entire email captured in a string, but it’s not immediately usable. Raw emails include headers, content type information, boundaries for attachments, and more.
Here’s a simplified example of what you’ll see:
Return-path: <sender@example.com>
Received: from mail-server.example.com ([192.168.1.1]:12345)
MIME-Version: 1.0
Date: Sun, 11 May 2025 14:32:01 +0600
Subject: Important information about your account
From: "John Doe" <john@example.com>
To: "Support Team" <support@yourdomain.com>
Content-Type: multipart/alternative; boundary=abc123

--abc123
Content-Type: text/plain; charset=UTF-8

This is the plain text version of the email.
--abc123
Content-Type: text/html; charset=UTF-8

<html>
<body>
<p>This is the <strong>HTML</strong> version of the email.</p>
</body>
</html>
--abc123--
It definitely looks messy – but don’t worry. We’ll parse it step by step.
Now for the fun part – extracting useful information from this raw data. We’ll use a straightforward approach to separate headers from the body and then extract specific header values:
// Split email into lines
$lines = explode("\n", $email_content);
// Initialize variables
$from = "";
$to = "";
$subject = "";
$headers = "";
$message = "";
$is_header = true;
// Process line by line
foreach ($lines as $line) {
    // Check if we're still in the header section
    if ($is_header) {
        // Drop the carriage return left over from CRLF line endings
        $line = rtrim($line, "\r");
        // Store all headers for reference
        $headers .= $line . "\n";
        // Extract specific headers
        if (preg_match("/^Subject: (.*)/i", $line, $matches)) {
            $subject = $matches[1];
        } elseif (preg_match("/^From: (.*)/i", $line, $matches)) {
            $from = $matches[1];
        } elseif (preg_match("/^To: (.*)/i", $line, $matches)) {
            $to = $matches[1];
        }
        // Empty line marks the end of headers
        if (trim($line) == "") {
            $is_header = false;
        }
    } else {
        // We're in the message body
        $message .= $line . "\n";
    }
}
This code loops through each line, collecting header information until it finds an empty line (which separates headers from the body). After that, it treats everything as the message body.
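One thing the line-by-line loop does not cover is header folding: a long header may continue on the next line, which then starts with a space or tab. The sketch below unfolds such lines before splitting names from values. The function name parse_headers and the choice to keep only the first occurrence of a repeated header are assumptions made for illustration.

```php
<?php
// Hypothetical helper: parse a raw header block into a lowercase-keyed array.
function parse_headers(string $rawHeaders): array
{
    // Normalize line endings, then unfold: a line break followed by
    // spaces or tabs continues the previous header line. The run of
    // whitespace is collapsed to a single space for simplicity.
    $text = str_replace(["\r\n", "\r"], "\n", $rawHeaders);
    $text = preg_replace('/\n[ \t]+/', ' ', $text);

    $headers = [];
    foreach (explode("\n", $text) as $line) {
        if (preg_match('/^([^\s:]+):[ \t]*(.*)$/', $line, $m)) {
            $name = strtolower($m[1]);
            // Keep the first occurrence; headers such as Received repeat.
            if (!isset($headers[$name])) {
                $headers[$name] = trim($m[2]);
            }
        }
    }
    return $headers;
}
```

With this in place, `parse_headers($headers)['subject']` returns the whole subject even when the sending client wrapped it across two lines.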
Real emails are more complex than our simple example. Most modern emails use MIME (Multipurpose Internet Mail Extensions) and include both plain text and HTML versions of the content, plus potentially attachments. Understanding MIME is crucial to parsing email with PHP.
To properly handle these emails, we need to identify the content type and boundaries:
// Extract content type and boundary
$content_type = "";
$boundary = "";
if (preg_match('/Content-Type: (.*?);/i', $headers, $matches)) {
    $content_type = $matches[1];
    if (preg_match('/boundary="?(.*?)"?(\s|$)/i', $headers, $matches)) {
        $boundary = $matches[1];
    }
}
// For multipart emails, extract the parts
$message_parts = array();
if ($boundary && strpos($content_type, 'multipart/') === 0) {
    $parts = explode("--" . $boundary, $message);
    foreach ($parts as $part) {
        // Skip the empty preamble and the "--" left over from the closing boundary
        if (trim($part) != "" && trim($part) != "--") {
            $message_parts[] = $part;
        }
    }
}
This code identifies whether the email is multipart (containing multiple formats) and extracts each part separately.
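Boundary parameters come in two shapes: unquoted (boundary=abc123) and quoted, where the value may contain characters such as = or spaces and may be followed by further parameters. A slightly stricter extraction could look like the sketch below; the function name extract_boundary is an assumption for illustration.

```php
<?php
// Hypothetical helper: pull the boundary parameter out of a Content-Type value.
function extract_boundary(string $contentType): ?string
{
    // Quoted form first: boundary="----=_Part 1"
    if (preg_match('/boundary="([^"]+)"/i', $contentType, $m)) {
        return $m[1];
    }
    // Unquoted form: runs until whitespace or a semicolon.
    if (preg_match('/boundary=([^\s;"]+)/i', $contentType, $m)) {
        return $m[1];
    }
    return null;
}
```

Returning null for a non-multipart message lets the caller fall back to treating the whole body as a single part.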
Now we can identify and extract specific content types from the email:
$plain_text = "";
$html_content = "";
foreach ($message_parts as $part) {
    // Check content type of this part
    if (preg_match('/Content-Type: text\/plain/i', $part)) {
        // Extract plain text after the first empty line
        $plain_text = preg_replace('/^.*?\r?\n\r?\n/s', '', $part);
    } elseif (preg_match('/Content-Type: text\/html/i', $part)) {
        // Extract HTML after the first empty line
        $html_content = preg_replace('/^.*?\r?\n\r?\n/s', '', $part);
    }
}
// If no parts were found, use the entire message as plain text
if (empty($message_parts) && trim($message) != "") {
    $plain_text = $message;
}
This gives us both the plain text and HTML versions of the email content, which you can use according to your application’s needs.
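Text parts are often transfer-encoded, which is why stray sequences such as =20 or =E9 can show up in the extracted body. The sketch below splits a part into its headers and body and undoes the Content-Transfer-Encoding. The function name decode_part is an assumption, and the code assumes each part carries at least one header line.

```php
<?php
// Hypothetical helper: split one MIME part into headers and body,
// then undo its Content-Transfer-Encoding.
function decode_part(string $part): string
{
    // Strip the line break left after the boundary marker; the part's
    // headers then end at the first blank line.
    $pieces = preg_split('/\r?\n\r?\n/', ltrim($part, "\r\n"), 2);
    $partHeaders = $pieces[0];
    $body = $pieces[1] ?? '';

    $encoding = '';
    if (preg_match('/^Content-Transfer-Encoding:\s*(\S+)/im', $partHeaders, $m)) {
        $encoding = strtolower($m[1]);
    }

    if ($encoding === 'quoted-printable') {
        return quoted_printable_decode($body);
    }
    if ($encoding === 'base64') {
        // base64_decode() in its default non-strict mode skips line breaks.
        return base64_decode($body);
    }
    return $body; // 7bit, 8bit, binary, or no header: leave untouched
}
```

The result is still in the part's own character set, so any conversion to UTF-8 happens after this step.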
Many applications need to handle email attachments. Here’s how to extract them:
$attachments = array();
foreach ($message_parts as $part) {
    // Check if this part has a filename (indicates attachment);
    // \s* also covers a filename folded onto the next line
    if (preg_match('/Content-Disposition:\s*attachment;\s*filename="?([^"\r\n;]+)"?/i', $part, $matches)) {
        $filename = trim($matches[1]);
        // Extract content type, with or without trailing parameters
        $attachment_type = "";
        if (preg_match('/Content-Type:\s*([^;\s]+)/i', $part, $matches)) {
            $attachment_type = $matches[1];
        }
        // Extract content transfer encoding (one token, e.g. base64)
        $encoding = "";
        if (preg_match('/Content-Transfer-Encoding:\s*(\S+)/i', $part, $matches)) {
            $encoding = strtolower($matches[1]);
        }
        // Extract content after headers
        $content = preg_replace('/^.*?\r?\n\r?\n/s', '', $part);
        // Decode content based on encoding
        if ($encoding == 'base64') {
            $content = base64_decode($content);
        } elseif ($encoding == 'quoted-printable') {
            $content = quoted_printable_decode($content);
        }
        // Store attachment info
        $attachments[] = array(
            'filename' => $filename,
            'type' => $attachment_type,
            'content' => $content
        );
    }
}
Now you can save these attachments to disk or process them in memory.
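If you do save attachments to disk, keep in mind that the filename comes from the sender and should not be trusted as a path. A minimal sketch of a safer write follows; the function name save_attachment, the allowed character set, and the random prefix are all assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: write one decoded attachment into $dir under a
// sanitized name and return the full path.
function save_attachment(string $dir, string $filename, string $content): string
{
    // Drop any directory components the sender supplied, then keep a
    // conservative character set.
    $safe = preg_replace('/[^A-Za-z0-9._-]/', '_', basename($filename));
    if ($safe === '' || $safe[0] === '.') {
        $safe = 'attachment_' . $safe;
    }
    // Prefix with random bytes so two files with the same name do not collide.
    $path = rtrim($dir, '/') . '/' . bin2hex(random_bytes(4)) . '_' . $safe;
    if (file_put_contents($path, $content) === false) {
        throw new RuntimeException("Could not write $path");
    }
    return $path;
}
```

A typical call would be `save_attachment('/path/to/uploads', $attachment['filename'], $attachment['content'])`, with the target directory kept outside the web root.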
Email clients can use various character encodings, leading to issues with special characters. Here’s how to handle common encoding problems:
// Handle quoted-printable encoded words in the subject,
// replacing each one in place so surrounding text is kept
$subject = preg_replace_callback('/=\?UTF-8\?Q\?(.*?)\?=/i', function ($m) {
    return quoted_printable_decode(str_replace('_', ' ', $m[1]));
}, $subject);
// Handle character encoding in message body
if (preg_match('/charset="?(.*?)"?(\s|$)/i', $headers, $matches)) {
    $charset = $matches[1];
    if (strtolower($charset) != 'utf-8') {
        // Convert to UTF-8 if possible
        if (function_exists('mb_convert_encoding')) {
            $plain_text = mb_convert_encoding($plain_text, 'UTF-8', $charset);
            $html_content = mb_convert_encoding($html_content, 'UTF-8', $charset);
        }
    }
}
This handles the most common encoding issues you’ll encounter, especially with non-English content.
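Encoded subjects are not limited to UTF-8 with Q encoding; they can use other charsets and base64 (B) encoding as well. PHP's iconv extension provides iconv_mime_decode() for this, so a more general decoder might be sketched as below. The wrapper name decode_header_value is an assumption, and the code relies on the iconv extension being available.

```php
<?php
// Hypothetical helper: decode every encoded word in a header value to UTF-8.
function decode_header_value(string $value): string
{
    // Relies on the iconv extension; CONTINUE_ON_ERROR keeps going past
    // malformed encoded words instead of giving up.
    $decoded = iconv_mime_decode($value, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');
    // Fall back to the raw value if decoding fails outright.
    return $decoded === false ? $value : $decoded;
}
```

The same function can be applied to the display name in From and To, which is encoded the same way as the subject.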
Let’s put it all together in a complete script:
#!/usr/local/bin/php -q
<?php
// Log file for debugging
$log_file = '/path/to/email_log.txt';
// Read email from stdin
$fd = fopen("php://stdin", "r");
$email_content = "";
while (!feof($fd)) {
    $email_content .= fread($fd, 1024);
}
fclose($fd);
// Log the raw email for debugging
file_put_contents($log_file, date('Y-m-d H:i:s') . " - Raw Email:\n" . $email_content . "\n\n", FILE_APPEND);
// Split email into lines
$lines = explode("\n", $email_content);
// Initialize variables
$from = "";
$to = "";
$subject = "";
$headers = "";
$message = "";
$is_header = true;
// Process line by line
foreach ($lines as $line) {
    if ($is_header) {
        // Drop the carriage return left over from CRLF line endings
        $line = rtrim($line, "\r");
        $headers .= $line . "\n";
        if (preg_match("/^Subject: (.*)/i", $line, $matches)) {
            $subject = $matches[1];
        } elseif (preg_match("/^From: (.*)/i", $line, $matches)) {
            $from = $matches[1];
        } elseif (preg_match("/^To: (.*)/i", $line, $matches)) {
            $to = $matches[1];
        }
        if (trim($line) == "") {
            $is_header = false;
        }
    } else {
        $message .= $line . "\n";
    }
}
// Extract content type and boundary
$content_type = "";
$boundary = "";
if (preg_match('/Content-Type: (.*?);/i', $headers, $matches)) {
    $content_type = $matches[1];
    if (preg_match('/boundary="?(.*?)"?(\s|$)/i', $headers, $matches)) {
        $boundary = $matches[1];
    }
}
// For multipart emails, extract the parts
$message_parts = array();
$plain_text = "";
$html_content = "";
if ($boundary && strpos($content_type, 'multipart/') === 0) {
    $parts = explode("--" . $boundary, $message);
    foreach ($parts as $part) {
        // Skip the empty preamble and the "--" left over from the closing boundary
        if (trim($part) != "" && trim($part) != "--") {
            // Check content type of this part
            if (preg_match('/Content-Type: text\/plain/i', $part)) {
                $plain_text = preg_replace('/^.*?\r?\n\r?\n/s', '', $part);
            } elseif (preg_match('/Content-Type: text\/html/i', $part)) {
                $html_content = preg_replace('/^.*?\r?\n\r?\n/s', '', $part);
            } elseif (preg_match('/Content-Disposition: attachment/i', $part)) {
                // Process attachment if needed
                // ...
            }
        }
    }
} else {
    // Not multipart, use entire message as plain text
    $plain_text = $message;
}
// Do something with the extracted data
// For example, store in database:
$db = new PDO('mysql:host=localhost;dbname=your_database', 'username', 'password');
$stmt = $db->prepare("INSERT INTO emails (sender, subject, text_content, html_content, received_at)
    VALUES (:sender, :subject, :text, :html, NOW())");
$stmt->execute([
    'sender' => $from,
    'subject' => $subject,
    'text' => $plain_text,
    'html' => $html_content
]);
// Log the processed information
$log_message = date('Y-m-d H:i:s') . " - Processed Email:\n";
$log_message .= "From: $from\nSubject: $subject\n";
$log_message .= "Plain Text: " . substr($plain_text, 0, 100) . "...\n\n";
file_put_contents($log_file, $log_message, FILE_APPEND);
When processing emails, always remember they come from an external source, making them potential security risks:
// Sanitize data before using in HTML context
$safe_subject = htmlspecialchars($subject, ENT_QUOTES, 'UTF-8');
$safe_from = htmlspecialchars($from, ENT_QUOTES, 'UTF-8');
// Use prepared statements for database operations
// (as shown in the complete example)
Email processing can be tricky to debug since you can’t just reload a page to test it. Create a robust logging system:
function log_email_processing($message, $data = null) {
    $log_file = '/path/to/email_logs/debug_' . date('Y-m-d') . '.log';
    $log_entry = date('Y-m-d H:i:s') . " - $message";
    if ($data !== null) {
        $log_entry .= "\n" . print_r($data, true);
    }
    file_put_contents($log_file, $log_entry . "\n\n", FILE_APPEND);
}
// Usage:
log_email_processing("Processing started");
log_email_processing("Extracted headers", $headers);
Different email clients format emails differently. Test your parser with various clients:
// Detect email client from headers
$email_client = "unknown";
if (strpos($headers, 'X-Mailer: Microsoft Outlook') !== false) {
    $email_client = "outlook";
    // Apply Outlook-specific parsing adjustments
} elseif (strpos($headers, 'User-Agent: Mozilla/5.0') !== false &&
    strpos($headers, 'Thunderbird') !== false) {
    $email_client = "thunderbird";
    // Apply Thunderbird-specific parsing adjustments
}
If special characters come out garbled, it is usually an encoding issue. Make sure you’re properly handling character encodings:
// For quoted-printable encoded text (common in emails)
$plain_text = quoted_printable_decode($plain_text);
// Then ensure UTF-8 encoding
if (function_exists('mb_convert_encoding')) {
    $plain_text = mb_convert_encoding($plain_text, 'UTF-8', 'AUTO');
}
Email line breaks can be problematic:
// Normalize line breaks
$plain_text = str_replace(["\r\n", "\r"], "\n", $plain_text);
// For HTML content display (escape first, since the text is untrusted)
$html_for_display = nl2br(htmlspecialchars($plain_text, ENT_QUOTES, 'UTF-8'));
Sometimes your script might process the same email multiple times:
// Generate a stable email ID based on content, so a redelivered copy gets the same ID
$email_id = md5($from . $subject . substr($message, 0, 100));
// Check if this email was already processed
$stmt = $db->prepare("SELECT id FROM processed_emails WHERE email_id = ?");
$stmt->execute([$email_id]);
if ($stmt->fetchColumn() !== false) {
    // Already processed this email, exit
    exit;
}
// Mark as processed
$db->prepare("INSERT INTO processed_emails (email_id, processed_at) VALUES (?, NOW())")
    ->execute([$email_id]);
Parsing email with PHP is an incredibly powerful technique that opens up many possibilities for web applications. Whether you’re building a forum system, a support ticket platform, or just automating your workflow, understanding how to process incoming emails is an essential skill.
Remember, the key points are:
I hope this guide has given you a solid foundation on how to parse email with PHP. The examples provided should work across most hosting environments and with most email clients.
What will you build with this knowledge? Let me know in the comments!
A: Absolutely! Most shared hosting providers support email piping. Just make sure to use the correct paths for your specific host.
A: Write each attachment to a temporary file as soon as it is decoded, then work with the file on disk instead of keeping every attachment in memory:
// When processing an attachment
$tmp_file = tempnam(sys_get_temp_dir(), 'email_attachment_');
file_put_contents($tmp_file, $attachment_content);
// Process the file on disk instead of in memory
A: Yes, the techniques shown work with PHP 7.x and PHP 8.x. For older PHP versions (5.x), you might need to adjust some syntax.
A: Save some sample raw emails to text files, then simulate the piping process:
cat sample_email.txt | php -q your_handler.php
A: Extract and store the Message-ID header from incoming emails. For replies, look for the In-Reply-To header which contains the Message-ID of the original email.
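Pulling those two headers out of the raw header block could be sketched as follows. The function name header_msgid is an assumption for illustration; the pattern accepts a value folded onto the next line.

```php
<?php
// Hypothetical helper: return the first <...> token of a named header, or null.
function header_msgid(string $rawHeaders, string $name): ?string
{
    // The m flag anchors ^ at each line start; \s* also spans a folded line.
    $pattern = '/^' . preg_quote($name, '/') . ':\s*(<[^<>\s]+>)/im';
    return preg_match($pattern, $rawHeaders, $m) ? $m[1] : null;
}
```

Storing `header_msgid($headers, 'Message-ID')` with each email and comparing it against `header_msgid($headers, 'In-Reply-To')` on later messages is enough to link a reply to its original. The Message-ID also makes a natural key for spotting an email that was delivered twice.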
View Comments
Thanks for your good solution.
Thanks! This was very useful.
Very good, thanks.
I added an insert statement to drop the data from the email message including email address etc into a MySQL DB - but I get a bunch of unwanted stuff in the database appearing in the 'message' column which looks like headers of the email like this:
"This is a multi-part message in MIME format.
------=_NextPart_000_0101_01CEAA6E.6A79E160
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
I also see quotes appearing in the other columns - how to get rid of any unwanted characters?
As far as I can suggest, you will need to filter them out with regular expressions. I don't see any other way around it at the moment; email content usually carries a lot of extra formatting characters, and these differ from client to client (Yahoo, Gmail, Outlook, etc.).
I have my text in Spanish, but when I reply to the mail with Outlook I get unwanted characters where I use the Spanish accents á ú í é ó, and at each line break I get =20. Can you give me a hint on how to replace these characters with the accented vowels again?
Thanks for your help
NICE WORK
Hello! I know this post is a little bit cold... but I wonder if you might help? I have implemented this code, and it works with one exception... each time a mail is successfully parsed i get 7 mails.
The processing i have programmed is supposed to identify the subject, and then add that as the body of a mail to my office email address.
only I get 7 identical emails!
Any thoughts or direction would be much appreciated!
cheers
LAex
Hi ,
Thanks for the post. It's really useful. I have implemented this functionality and it's working fine.
Regards,
Ashok Singh
Hi,
How can I use this to also retrieve attachments from the email?
Regards,
How can I add the forwarder's path in cPanel when using the CodeIgniter framework?
You can put something like "/path/to/domain/index.php/controller/method" easily to get it working. Just imagine, .htaccess won't have any effect as this is not going through Apache server, rather direct access.
I have implemented the script and it is working fine. The only issue is that I am trying to send an email back to the sender with an output text file of the parsed info as an attachment. The output text file looks good, however the sender is receiving the email without the attachment.
Is it possible to send an attachment email from the script itself since the sending email part with attachment works when executed separately ?