Skip to content Skip to sidebar Skip to footer

How To Get An Image (inlineshape) From Paragraph Python Docx

I want to read the docx document paragraph by paragraph and if there is a picture (InlineShape), then process it with the text around it. The function Document.inline_shapes will g

Solution 1:

Assume your paragraph is par, you may use the following code to find the images

import xml.etree.ElementTree as ET
def hasImage(par):
    """get all of the images in a paragraph 
    :param par: a paragraph object from docx
    :return: a list of r:embed 
    """
    ids = []
    root = ET.fromstring(par._p.xml)
    namespace = {
             'a':"http://schemas.openxmlformats.org/drawingml/2006/main", \
             'r':"http://schemas.openxmlformats.org/officeDocument/2006/relationships", \
             'wp':"http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"}

    inlines = root.findall('.//wp:inline',namespace)
    for inline in inlines:
        imgs = inline.findall('.//a:blip', namespace)
        for img in imgs:     
            id = img.attrib['{{{0}}}embed'.format(namespace['r'])]
        ids.append(id)

    return ids

Post a Comment for "How To Get An Image (inlineshape) From Paragraph Python Docx"