The problem
Convert .doc, .odt, and .docx files to .pdf, or any other combination (e.g. .docx to .odt), under PHP.
To solve this problem we'll install Unoconv and the LibreOffice command-line tools, and build a PHP class.
Installing
Unoconv is a Python tool that uses the LibreOffice libraries (pyuno).
Installing LibreOffice command line tools
On a server you don't need a full LibreOffice install, just the command-line tools and the converters found in the core packages.
For Ubuntu/Debian:
apt-get install openjdk-6-jdk libreoffice-core libreoffice-common libreoffice-writer libreoffice-script-provider-python
Important:
Installing libreoffice-writer gives you support for converting TEXT documents.
To convert other formats (spreadsheets, presentations, images, etc.), install the related LibreOffice package.
For images, consider lighter and well-known tools such as ImageMagick.
For converting PDF to text, consider pdftotext, which you can find in the poppler-utils package.
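Before moving on, it can help to confirm that the install actually put the expected commands on the PATH. The following is a minimal sketch, not part of the original setup: the helper name missing_tools is hypothetical, and the assumption that the packages above provide a soffice binary should be checked on your distribution.

```shell
#!/bin/sh
# Print each command name from the arguments that is NOT found on PATH.
# Prints nothing and returns 0 when every command is available.
# (missing_tools is a hypothetical helper name used for illustration.)
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `missing_tools soffice` prints nothing if the LibreOffice launcher is available, and prints `soffice` otherwise.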
Installing Unoconv
As root:
cd /tmp
git clone https://github.com/dagwieers/unoconv
cd unoconv/
make install
cd ../
rm -rf unoconv/
unoconv --listener &
This starts LibreOffice/OpenOffice as a service listening on a local port; you can check that it is running with ps aux | grep soffice.
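The process check above can be wrapped in a small function so scripts can test for the listener before converting. This is a sketch under assumptions: the function names has_soffice and listener_status are made up for illustration, and it only looks for a process whose command line contains "soffice", not for the listening port itself.

```shell
#!/bin/sh
# Read a process listing on stdin; succeed if it contains a soffice process.
# The bracket in '[s]office' keeps the pattern from matching the grep
# command's own entry in the listing.
has_soffice() {
    grep -q '[s]office'
}

# Report the listener state based on the live process table.
listener_status() {
    if ps aux | has_soffice; then
        echo "listener running"
    else
        echo "listener not running"
    fi
}
```

Calling `listener_status` after `unoconv --listener &` should report that the listener is running once soffice has finished starting.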
Some warnings:
- Although you can convert with only a LibreOffice/OpenOffice install, using the listener service with unoconv is better for bulk operations, because you reuse an instance that stays in memory.
- The unoconv package is already in the Debian repositories, but that's an old version.
Showing supported formats
unoconv --show
Creating a Daemon
To daemonize unoconv (better for server mode), create a file /etc/init.d/unoconvd with the following content:
(Source)
#!/bin/sh
### BEGIN INIT INFO
# Provides: unoconvd
# Required-Start: $network
# Required-Stop: $network
# Default-Start: 2 3 5
# Default-Stop:
# Description: unoconvd - Converting documents to PDF by unoconv
### END INIT INFO
case "$1" in
    start)
        /usr/bin/unoconv --listener &
        ;;
    stop)
        killall soffice.bin
        ;;
    restart)
        killall soffice.bin
        sleep 1
        /usr/bin/unoconv --listener &
        ;;
esac
Then adjust the permissions, enable the script at boot, and start the daemon:
chmod 755 /etc/init.d/unoconvd
update-rc.d unoconvd defaults
service unoconvd start
Basic use
It doesn't matter whether you started unoconv manually or as a daemon; you can convert files as shown below:
unoconv --format pdf --output /OUTPUT_DIR/ file.odt
That will convert file.odt to file.pdf in the specified output directory.
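Since the listener pays off mostly for bulk work, the single command above extends naturally to a loop over a directory. The following is a rough sketch rather than a definitive script: the function name convert_all and the UNOCONV override variable are hypothetical, introduced here only so the loop can be dry-run without unoconv installed.

```shell
#!/bin/sh
# Convert every .odt file in a directory using the same flags as above.
# UNOCONV is a hypothetical override for the converter command, handy for
# dry runs (for example UNOCONV=echo); it defaults to unoconv.
convert_all() {
    src_dir=$1
    out_dir=$2
    format=${3:-pdf}
    failed=0
    for file in "$src_dir"/*.odt; do
        # Skip the unexpanded pattern when the directory has no .odt files.
        [ -e "$file" ] || continue
        if ! ${UNOCONV:-unoconv} --format "$format" --output "$out_dir" "$file"; then
            echo "conversion failed: $file" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `convert_all ./docs /OUTPUT_DIR/ pdf` would convert each .odt file under ./docs; the function returns a non-zero status if any single conversion fails, so a calling script can react to partial failures.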
PHP Class
A simple PHP wrapper could look like this:
<?php
namespace Unoconv;

/**
 * Unoconv class wrapper
 *
 * @author Rafael Goulart <rafaelgou@gmail.com>
 * @see http://tech.rgou.net/
 */
class Unoconv {

    /**
     * Basic converter method
     *
     * @param string $originFilePath Origin file path
     * @param string $outputDirPath  Output directory path
     * @param string $toFormat       Format to export to
     *
     * @return int Exit status of the unoconv command (0 on success)
     */
    public static function convert($originFilePath, $outputDirPath, $toFormat)
    {
        // Escape every argument so that paths with spaces or shell
        // metacharacters cannot break or hijack the command.
        $command = sprintf(
            'unoconv --format %s --output %s %s',
            escapeshellarg($toFormat),
            escapeshellarg($outputDirPath),
            escapeshellarg($originFilePath)
        );
        // The second argument of system() receives the command's exit status.
        system($command, $exitCode);
        return $exitCode;
    }

    /**
     * Convert to PDF
     *
     * @param string $originFilePath Origin file path
     * @param string $outputDirPath  Output directory path
     *
     * @return int Exit status of the unoconv command
     */
    public static function convertToPdf($originFilePath, $outputDirPath)
    {
        return self::convert($originFilePath, $outputDirPath, 'pdf');
    }

    /**
     * Convert to TXT
     *
     * @param string $originFilePath Origin file path
     * @param string $outputDirPath  Output directory path
     *
     * @return int Exit status of the unoconv command
     */
    public static function convertToTxt($originFilePath, $outputDirPath)
    {
        return self::convert($originFilePath, $outputDirPath, 'txt');
    }
}
Sample use:
<?php
/**
* Sample use of Unoconv class
*
*/
require 'Unoconv.php';
use Unoconv\Unoconv;
// Converting to PDF
$originFilePath = 'test.odt';
$outputDirPath = './';
Unoconv::convertToPdf($originFilePath, $outputDirPath);
// Converting to DOCX
$originFilePath = 'test.odt';
$outputDirPath = './';
Unoconv::convert($originFilePath, $outputDirPath, 'docx');