Aspose.Words convert to html (only body content) - aspose

I can create a Word file and convert it to HTML with the Aspose.Words API. How do I get only the BODY content of the HTML through the API (without the html, head, and body tags)? I will use this to show the output in a WYSIWYG editor (Summernote).
Note: I am developing the application with .net Framework (C#)

Document doc = new Document(MyDir + "inputdocx.docx");
var options = new Aspose.Words.Saving.HtmlSaveOptions(SaveFormat.Html)
{
ImageSavingCallback = new HandleImageSaving(),
};
String html = doc.FirstSection.Body.ToString(options);

By default, Aspose.Words saves HTML in XHTML format, so you can safely load it into an XmlDocument and get the body tag's content. For example, see the following code.
// Create a simple document for testing.
DocumentBuilder builder = new DocumentBuilder();
builder.Writeln("Hello world!!!");
// For testing purposes insert an image.
builder.InsertImage(@"https://cms.admin.containerize.com/templates/aspose/App_Themes/V3/images/aspose-logo.png");
// Additional options can be specified in the corresponding save options.
HtmlSaveOptions opt = new HtmlSaveOptions(SaveFormat.Html);
// For example, output images in the HTML as base64 string (summernote supports base64)
opt.ExportImagesAsBase64 = true;
// Save the document to MemoryStream.
using (MemoryStream ms = new MemoryStream())
{
    builder.Document.Save(ms, opt);
    // Move the stream position to the beginning and load the resulting HTML into an XML document.
    ms.Position = 0;
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.Load(ms);
    // Find the body tag.
    XmlNode body = xmlDoc.SelectSingleNode("//body");
    // Get the inner XML of the body.
    Console.WriteLine(body.InnerXml);
}
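The same parse-then-extract idea can be sketched outside .NET. The following is a minimal Java illustration using only the JDK's XML APIs, not Aspose.Words; the class name `BodyExtractor`, the method `bodyInnerXml`, and the sample markup are all made up for this example. It assumes the input is well-formed XHTML without a DOCTYPE that would need to be fetched from the network.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: returns the markup between <body> and </body>.
public class BodyExtractor {

    static String bodyInnerXml(String xhtml) {
        try {
            // Parse the XHTML string as plain XML (assumes it is well-formed).
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));

            NodeList bodies = doc.getElementsByTagName("body");
            if (bodies.getLength() == 0) {
                return "";
            }

            // An identity transformer serializes each child of <body> in turn.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            NodeList children = bodies.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                serializer.transform(new DOMSource(children.item(i)), new StreamResult(out));
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Input could not be parsed as XML", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><head><title>t</title></head>"
                + "<body><p>Hello world!!!</p></body></html>";
        System.out.println(bodyInnerXml(xhtml));
    }
}
```

Only the children of the body element are serialized, so the html, head, and body wrappers never reach the editor.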
Hope this helps.
Disclosure: I work at Aspose.Words team.

Related

Document Class of ASPOSE.WORDS.CLOUD

I am trying to convert a docx file to PDF using the Aspose.Words Cloud SDK, but I am getting the following error with the Document class:
Document doc = new Document("D:\Aspose\Template.docx");
error: 'Document' does not contain a constructor that takes 1 arguments
P.S. Please point me to the proper documentation/manual for the aspose.words.cloud SDK.
@MetorGeek
There is no Document class in aspose-words-cloud .NET SDK. You may use the following example to convert Word document to PDF:
var format = "pdf";
using (var fileStream = File.OpenRead(BaseTestContext.GetDataDir(this.convertFolder) + "test_uploadfile.docx"))
{
var request = new PutConvertDocumentRequest(fileStream, format);
var result = this.WordsApi.PutConvertDocument(request);
}
Please check Converting a Document documentation section for a detailed description of Conversion APIs.

How to make text hyperlinked using the Aspose.Words DOM approach?

I am trying to create a Word document with Aspose.Words for .NET using the DOM approach. How would I make a piece of text a hyperlink?
Clicking the text in the Docx should open a web page.
Example: click here
This can be done by appending a hyperlink field to the paragraph. See the sample code below.
// Create or load a document
Aspose.Words.Document wordDoc = new Aspose.Words.Document();
// Get first paragraph
Aspose.Words.Paragraph para = wordDoc.FirstSection.Body.FirstParagraph;
para.Runs.Add(new Run(wordDoc, "Visit "));
// Add the hyperlink field to the paragraph
FieldHyperlink field = (FieldHyperlink)para.AppendField(Aspose.Words.Fields.FieldType.FieldHyperlink, false);
// URL
field.Address = @"""http://www.aspose.com""";
// Text
field.Result = "Aspose";
field.Update();
// Set color of the last run
para.Runs[para.Runs.Count - 1].Font.Color = System.Drawing.Color.Blue;
// Save the document
string dst = (dataDir + @"hyperlink.docx");
wordDoc.Save(dst);
I work with Aspose as Developer Evangelist.

How to read word each page?

I know the doc.Save() function saves all pages in one HTML file.
The doc.RenderToScale() function saves each page to a separate image file.
But I want to read or save each page as a separate HTML file, and I have no idea how. Can you help me?
You can use the following code sample to convert each page to HTML or any other format supported by Aspose.Words.
String srcDoc = Common.DATA_DIR + "src.docx";
String dstDoc = Common.DATA_DIR + "dst {PAGE_NO}.html";
Document doc = new Document(srcDoc);
LayoutCollector layoutCollector = new LayoutCollector(doc);
// This will build layout model and collect necessary information.
doc.updatePageLayout();
// Split nodes in the document into separate pages.
DocumentPageSplitter splitter = new DocumentPageSplitter(layoutCollector);
// Save each page to disk as separate documents.
for (int page = 1; page <= doc.getPageCount(); page++)
{
Document pageDoc = splitter.getDocumentOfPage(page);
pageDoc.save(dstDoc.replace("{PAGE_NO}", page+""));
}
It depends on 3 other classes, which you can find in this zip file.
I work with Aspose as Developer Evangelist.

Saxon XSLT .Net Transformation: What to give in BaseURI when xml and xsl both are passed as strings

This is the code I have for Saxon Transformation of XSLT files which accepts xml and xslt and returns a transformed string. I can have either xsl 1.0 or 2.0 get processed through this function.
DocumentBuilder requires a BaseURI even though my input does not come from a file. I have provided "c:\\" as the BaseURI, even though I have nothing to do with this directory.
Is there any better way to achieve this thing or write this function?
public static string SaxonTransform(string xmlContent, string xsltContent)
{
// Create a Processor instance.
Processor processor = new Processor();
// Load the source document into a DocumentBuilder
DocumentBuilder builder = processor.NewDocumentBuilder();
Uri sUri = new Uri("c:\\");
// Now set the baseUri for the builder we created.
builder.BaseUri = sUri;
// Instantiating the Build method of the DocumentBuilder class will then
// provide the proper XdmNode type for processing.
XdmNode input = builder.Build(new StringReader(xmlContent));
// Create a transformer for the stylesheet.
XsltTransformer transformer = processor.NewXsltCompiler().Compile(new StringReader(xsltContent)).Load();
// Set the root node of the source document to be the initial context node.
transformer.InitialContextNode = input;
StringWriter results = new StringWriter();
// Create a serializer.
Serializer serializer = new Serializer();
serializer.SetOutputWriter(results);
transformer.Run(serializer);
return results.ToString();
}
If you think that the base URI will never be used (because you never do anything that depends on the base URI) then the best strategy is to set a base URI that will be instantly recognizable if your assumption turns out to be wrong, for example "file:///dummy/base/uri".
Choose something that is a legal URI (C:\ is not).
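To illustrate the point about legality, here is a minimal Java sketch using `java.net.URI`, which parses strictly according to the generic URI syntax. The class name `BaseUriCheck` and the helper `isLegalUri` are invented for this example. Note that .NET's `System.Uri` is more lenient and silently converts a Windows path into a file URI, which is probably why the original code ran without complaint.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: checks whether a string parses as a URI in Java.
public class BaseUriCheck {

    static boolean isLegalUri(String candidate) {
        try {
            new URI(candidate);
            return true;
        } catch (URISyntaxException e) {
            // Thrown for illegal characters such as a backslash.
            return false;
        }
    }

    public static void main(String[] args) {
        // The recognizable dummy base URI suggested above.
        System.out.println(isLegalUri("file:///dummy/base/uri"));
        // A bare Windows path containing a backslash.
        System.out.println(isLegalUri("C:\\"));
    }
}
```

A dummy value like the first one has the added benefit that it stands out immediately in an error message if something does end up resolving a relative reference against it.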

Converting Files to PDF and attaching to another PDF in Coldfusion

So I'm doing a project that generates a PDF of information that was previously filled out in a form. Along with this information, documents were attached to support the information in the form.
I generate the PDF with the normal info from my DB, but I also want to convert their uploaded files (if .doc or .docx) to PDF format and stick in the same PDF. (So it is all in one place.)
I know how to convert to PDF, problem is how do you attach those newly generated PDFs to the current one with the other information on it?
You have two options:
1. Merge all PDFs into one using <cfpdf action="merge"...>
2. Really attach the files to your main PDF. As CFPDF does not support this (yet?), you have to use iText:
<cfscript>
try {
// Source of THE main PDF and destination file
inputFile = ExpandPath("myDoc.pdf");
outputFile = ExpandPath("myDocPlusAttachments.pdf");
// the file to attach (can be of any type)
attach1 = ExpandPath("myAttachment.doc");
// prepare everything
reader = createObject("java", "com.lowagie.text.pdf.PdfReader").init( inputFile );
outStream = createObject("java", "java.io.FileOutputStream").init( outputFile );
stamper = createObject("java", "com.lowagie.text.pdf.PdfStamper").init( reader, outStream );
// attach the file
stamper.addFileAttachment("My Attached File", javacast("null", ""), attach1, "myAttachment.doc");
// display the attachment pane when the pdf opens (Since 1.6)
writer = stamper.getWriter();
writer.setPdfVersion( writer.VERSION_1_6 );
}
finally {
// always cleanup objects
if (IsDefined("stamper")) {
stamper.close();
}
if (IsDefined("outStream")) {
outStream.close();
}
}
</cfscript>
Just found where I got that piece of code: ColdFusion 9: Adding Document Level Attachments to a PDF with iText
You need to use the CFPDF tag, and use the merge action.