pydocmaker.backend.ex_docx

Attributes

can_run_pandoc

Document

log

gwin32

gcomposer

gmailmerge

Classes

msoPictureCompress

Enumeration of picture compression types for MS Office.

DocxFileW32

A context manager for working with DOCX files using Win32COM automation.

DocxFile

A class to handle operations on DOCX files such as appending multiple documents, replacing fields, and saving the modified document.

docx_renderer

Helper class that provides a standard way to create an ABC using inheritance.

Functions

_make_output(bts, output_path_or_buffer)

_get_bytes_file_or_buffer(file_path_or_buffer)

_test_docxw32_installed([verb, force_reload])

Test whether win32com and Microsoft Word are available.

can_use_w32_word([verb, force_reload])

Test whether Microsoft Word and the win32com library are available.

can_use_libreoffice([force_reload])

Test whether LibreOffice is available.

blue(run)

red(run)

convert_pandoc(→ bytes)

convert(→ bytes)

Convert a list of document sections into a DOCX or PDF (via docx) file using a specified template.

Module Contents

pydocmaker.backend.ex_docx.can_run_pandoc
pydocmaker.backend.ex_docx.Document = None
pydocmaker.backend.ex_docx.log
pydocmaker.backend.ex_docx.gwin32 = None
pydocmaker.backend.ex_docx.gcomposer = None
pydocmaker.backend.ex_docx.gmailmerge = None
pydocmaker.backend.ex_docx._make_output(bts, output_path_or_buffer)
pydocmaker.backend.ex_docx._get_bytes_file_or_buffer(file_path_or_buffer)
pydocmaker.backend.ex_docx._test_docxw32_installed(verb=0, force_reload=False)

Test whether win32com and Microsoft Word are available.

Returns:

0 if both are available, 1 if win32com is not available, 2 if win32com is available but Microsoft Word is not.

Return type:

int
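The three return codes can be illustrated with a small stand-alone probe. This is only a sketch of the documented behaviour, not the library's implementation: `probe_word_automation` and `STATUS_TEXT` are hypothetical names, and the use of `DispatchEx` followed by `Quit()` is an assumption about how such a check might be written.

```python
import importlib

# Meaning of each status code, taken from the return description above.
STATUS_TEXT = {
    0: "win32com and Microsoft Word are both available",
    1: "win32com is not available",
    2: "win32com is available, but Microsoft Word is not",
}


def probe_word_automation() -> int:
    """Hypothetical probe that returns one of the three documented codes."""
    try:
        client = importlib.import_module("win32com.client")
    except ImportError:
        return 1  # win32com cannot be imported on this machine
    try:
        # DispatchEx asks COM for a new Word instance, so the Quit() below
        # is not meant to touch a Word window the user already has open.
        app = client.DispatchEx("Word.Application")
    except Exception:
        return 2  # win32com imported, but Word could not be started
    app.Quit()
    return 0


code = probe_word_automation()
print(code, "->", STATUS_TEXT[code])
```

On a machine without pywin32 the probe returns 1 without raising, which is the case most non-Windows environments will hit.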

pydocmaker.backend.ex_docx.can_use_w32_word(verb=0, force_reload=False)

Test whether Microsoft Word and the win32com library are available.

Parameters:
  • verb (int, optional) – Whether to print verbose information. Defaults to 0.

  • force_reload (bool, optional) – Whether to force a re-test even if the check has already been run. Defaults to False.

Returns:

True if both are available, False if not

Return type:

bool

pydocmaker.backend.ex_docx.can_use_libreoffice(force_reload=False)

Test whether LibreOffice is available.

Parameters:

force_reload (bool, optional) – Whether to force a re-test even if the check has already been run. Defaults to False.

Returns:

True if available, False if not

Return type:

bool
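The force_reload parameter suggests that the result of the check is remembered between calls. A minimal sketch of that pattern follows. How the library actually detects LibreOffice is not documented here, so the lookup of `soffice` or `libreoffice` on PATH and the names `libreoffice_on_path` and `_cached_result` are assumptions made for illustration.

```python
import shutil
from typing import Optional

# None means "not tested yet"; afterwards it holds the last test result.
_cached_result: Optional[bool] = None


def libreoffice_on_path(force_reload: bool = False) -> bool:
    """Hypothetical probe: True if a LibreOffice launcher is found on PATH."""
    global _cached_result
    if _cached_result is None or force_reload:
        # Assumed detection strategy: look for the usual launcher names.
        _cached_result = any(
            shutil.which(name) is not None
            for name in ("soffice", "libreoffice")
        )
    return _cached_result


print(libreoffice_on_path())                   # runs the lookup once
print(libreoffice_on_path())                   # served from the cache
print(libreoffice_on_path(force_reload=True))  # repeats the lookup
```

Caching keeps repeated availability checks cheap, while force_reload covers the case where software was installed after the first call.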

class pydocmaker.backend.ex_docx.msoPictureCompress(*args, **kwds)

Bases: enum.Enum

Enumeration of picture compression types for MS Office.

Default = 0
HQPrint = 1
Print = 2
Email = 3
Screen = 4
Photo = 16
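
Because compress_images accepts either an enum member or a plain int, a caller may want to normalise the argument first. The sketch below uses a stand-in enum, `PictureCompress`, that copies the documented member values so the block runs without the library. The helper `to_compression_value` and its choice to reject unknown integers are assumptions, not documented behaviour.

```python
from enum import Enum
from typing import Union


class PictureCompress(Enum):
    """Stand-in that mirrors the documented msoPictureCompress members."""
    Default = 0
    HQPrint = 1
    Print = 2
    Email = 3
    Screen = 4
    Photo = 16


def to_compression_value(quality: Union[PictureCompress, int]) -> int:
    """Return the integer value for an enum member or a known int."""
    if isinstance(quality, PictureCompress):
        return quality.value
    # Enum lookup by value raises ValueError for integers with no member.
    return PictureCompress(quality).value


print(to_compression_value(PictureCompress.Screen))
print(to_compression_value(16))
```
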
class pydocmaker.backend.ex_docx.DocxFileW32(docx_path, outpath=None)

A context manager for working with DOCX files using Win32COM automation.

This class provides methods to manipulate DOCX documents via Microsoft Word’s COM interface, including image compression, field updates, and exporting to PDF format. It ensures proper cleanup by closing the Word application and document upon exiting the context manager.

Warning: this class requires the win32com library and Microsoft Word to be installed.

static is_installed(verb=0, force_reload=False, ret_int=False)
docx_path = b'.'
outpath = None
word = None
worddoc = None
_created_word_app = False
__enter__()

Enter the runtime context for the DocxFileW32 instance. If “win32com.client” has not been imported yet, it is imported at this point.

Returns:

The instance itself for use in a ‘with’ statement.

Return type:

DocxFileW32

compress_images(compressionQuality=msoPictureCompress.Screen)

Compress all images within the document based on specified quality settings.

Parameters:

compressionQuality (msoPictureCompress or int) – The compression level to apply to images. Defaults to msoPictureCompress.Screen (value 4).

Returns:

The instance itself to allow method chaining.

Return type:

DocxFileW32

update_fields()

Update all fields in the document, including those in headers and footers.

Returns:

The instance itself to allow method chaining.

Return type:

DocxFileW32

export(pdf_path=None, optimize_for_screen=True)

Export the document to PDF format with specified options.

Parameters:
  • pdf_path (str, optional) – The path where the PDF will be saved. If None, uses outpath.

  • optimize_for_screen (bool) – True for wdExportOptimizeForOnScreen (=1) else wdExportOptimizeForPrint (=0).

Returns:

The instance itself to allow method chaining.

Return type:

DocxFileW32

Raises:

AssertionError – If pdf_path is None and outpath is also None.

save()
close()
__exit__(exc_type, exc_value, traceback)

Exit the runtime context for the DocxFileW32 instance, saving and closing resources.

Parameters:
  • exc_type – Exception type (if any) that occurred during execution.

  • exc_value – Exception value (if any) that occurred during execution.

  • traceback – Traceback object (if any) that occurred during execution.
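
A usage sketch for the context manager, relying only on the methods documented above. The wrapper `export_pdf_if_possible` is a hypothetical helper. It returns False instead of raising when the input file, pydocmaker, or Microsoft Word is missing, and it passes absolute paths on the assumption that Word's COM interface handles them more reliably than relative ones.

```python
import os


def export_pdf_if_possible(docx_path: str, pdf_path: str) -> bool:
    """Hypothetical wrapper: export a DOCX to PDF through Word if available."""
    if not os.path.isfile(docx_path):
        return False  # nothing to convert
    try:
        from pydocmaker.backend.ex_docx import DocxFileW32, can_use_w32_word
    except ImportError:
        return False  # pydocmaker is not installed
    if not can_use_w32_word():
        return False  # win32com or Microsoft Word is missing

    # The documented methods return the instance, so they can be chained.
    # Leaving the block saves the document and closes it.
    with DocxFileW32(os.path.abspath(docx_path)) as doc:
        doc.update_fields().compress_images().export(os.path.abspath(pdf_path))
    return True


print(export_pdf_if_possible("report.docx", "report.pdf"))
```

Since __exit__ is documented as saving on exit, working on a copy of the input file is a reasonable precaution when the original must stay unchanged.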

class pydocmaker.backend.ex_docx.DocxFile(file_path_or_buffer)

A class to handle operations on DOCX files such as appending multiple documents, replacing fields, and saving the modified document.

file_path_or_buffer
docx_data
inp_data
_get_bytes_file_or_buffer(file_path_or_buffer)

Retrieve bytes from a file path or buffer.

Parameters:

file_path_or_buffer – Path to the DOCX file or a buffer containing DOCX data.

Returns:

Bytes of the DOCX file.
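
Accepting either a path or a buffer is a common pattern, and one way to implement it is sketched below. This is not the library's code: `read_bytes` is a hypothetical name, and the decisions to accept raw bytes and to read a buffer from its current position are assumptions.

```python
import io
import os
from typing import BinaryIO, Union

Source = Union[bytes, bytearray, str, os.PathLike, BinaryIO]


def read_bytes(source: Source) -> bytes:
    """Hypothetical helper: return the bytes behind a path, buffer, or bytes."""
    if isinstance(source, (bytes, bytearray)):
        return bytes(source)  # already in memory
    if isinstance(source, (str, os.PathLike)):
        with open(source, "rb") as fh:  # treat strings as file paths
            return fh.read()
    if hasattr(source, "read"):
        return source.read()  # file-like object, read from current position
    raise TypeError(f"unsupported source type: {type(source).__name__}")


print(read_bytes(io.BytesIO(b"PK\x03\x04")))
```
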

append(*files, verb=0) → bytes

Append multiple Word (.docx) documents into a single document.

Parameters:
  • files – Variable-length argument list of bytes, file paths, or buffers to be appended.

  • verb – Verbosity level (0 for silent, 1 for verbose).

Returns:

The current instance of DocxFile with the appended DOCX data.

replace_fields(replace_dict)

Replace all MergeFields of a DOCX file with given text in form of a dict.

Parameters:

replace_dict – Dictionary where keys are field names and values are replacement texts.

Returns:

The current instance of DocxFile with the replaced fields.

get_fields()

Retrieve the merge fields from the DOCX document.

Returns:

A list of merge fields present in the DOCX document.

Return type:

list

replace_keywords(replace_dict)

Edit raw XML content of a DOCX file by replacing specified strings using python-docx.

Parameters:

replace_dict – Dictionary where keys are strings to be replaced and values are replacement strings.

Returns:

The current instance of DocxFile with the replaced keywords.

replace_keywords_raw(replace_dict)

Edit raw XML content of a DOCX file by replacing specified strings in all XML files within the document.

Parameters:

replace_dict – Dictionary where keys are strings to be replaced and values are replacement strings.

Returns:

The current instance of DocxFile with the replaced keywords in raw XML.

save(output_path_or_buffer=None)

Save the modified DOCX data to a file or buffer.

Parameters:

output_path_or_buffer – File path or buffer to save the DOCX data.

Returns:

The current instance of DocxFile.

pydocmaker.backend.ex_docx.blue(run)
pydocmaker.backend.ex_docx.red(run)
pydocmaker.backend.ex_docx.convert_pandoc(doc: List[dict]) → bytes
pydocmaker.backend.ex_docx.convert(doc: List[dict], template=None, template_params=None, use_w32=False, as_pdf=False, compress_images=False, filename=None, allow_pandoc=True, **kwargs) → bytes

Convert a list of document sections into a DOCX or PDF (via docx) file using a specified template.

Parameters:
  • doc (List[dict]) – The document content structured as a list of dictionaries.

  • template (str, optional) – Path to a DOCX template file. Defaults to None.

  • template_params (dict, optional) – Parameters to replace fields in the template. Defaults to None.

  • use_w32 (bool, optional) – Whether to use win32com for updating document fields and for the options below that rely on it. Requires win32com and Microsoft Word to be installed. Defaults to False.

  • as_pdf (bool, optional) – Whether to output the document as a PDF (via docx and win32com). Defaults to False.

  • compress_images (bool, optional) – Whether to compress images in the document using win32com. Defaults to False.

  • filename (str, optional) – Filename to give the document in case it has to be saved as a temporary file. By default the name is taken from the metadata; if none is found, tempfile.docx is used.

  • allow_pandoc (bool, optional) – Whether to allow pandoc to be used instead of python-docx (pandoc usually creates nicer documents). Defaults to True.

  • **kwargs – Only used to check whether invalid keyword arguments were passed.

Returns:

The byte content of the generated DOCX or PDF file.

Return type:

bytes

Raises:

ValueError – If attempting to export to PDF without win32com and Word.Application installed and use_w32 set to True.
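
The interaction between use_w32, as_pdf, and compress_images can be checked before calling convert. The following pre-flight helper is a sketch only: `build_convert_kwargs` is a hypothetical name, and its policy (raise early for PDF without Word, silently drop image compression when Word is missing) is an assumption rather than documented behaviour.

```python
from typing import Any, Dict


def build_convert_kwargs(want_pdf: bool, word_available: bool,
                         compress_images: bool = False) -> Dict[str, Any]:
    """Hypothetical helper: choose keyword arguments for convert()."""
    if want_pdf and not word_available:
        # Fail early with the same exception type convert() documents.
        raise ValueError("PDF export requires win32com and Microsoft Word")
    # Image compression goes through win32com, so drop it without Word.
    compress = compress_images and word_available
    return {
        "use_w32": want_pdf or compress,
        "as_pdf": want_pdf,
        "compress_images": compress,
    }


print(build_convert_kwargs(want_pdf=False, word_available=False))
print(build_convert_kwargs(want_pdf=True, word_available=True))
```

The resulting dictionary could then be passed on as `convert(doc, **kwargs)`, with word_available taken from can_use_w32_word().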

class pydocmaker.backend.ex_docx.docx_renderer(template_path: str = None, make_blue=False)

Bases: pydocmaker.backend.baseformatter.BaseFormatter

Helper class that provides a standard way to create an ABC using inheritance.

d
make_blue = False
add_paragraph(newtext, *args, **kwargs)
add_run(text, *args, **kwargs)
digest_text(children, *args, **kwargs)
digest_str(children, *args, **kwargs)
digest_line(children, *args, **kwargs)
digest_markdown(children, *args, **kwargs)
digest_verbatim(children, *args, **kwargs)
digest_latex(children, *args, **kwargs)
handle_error(err, el=None) → list
digest_iterator(children, *args, **kwargs)
digest_table(children=None, **kwargs) → str
digest_image(children, *args, **kwargs)
abstractmethod format(*args, **kwargs)
doc_to_bytes()
save(filepath)