How to scrape an unordered list with go-colly?

I am trying to build a personal scraper for food recipes. I can get every other element except the ingredients, which are in an unordered list.
Here is a snippet of the page HTML:
pagehtml
My code so far prints "Ingredients found" but doesn't find any strong elements:
collectorDish.OnHTML(".ingredients", func(element *colly.HTMLElement) {
    fmt.Println("Ingredients found")
    element.ForEach("li", func(_ int, el *colly.HTMLElement) {
        fmt.Println(el.ChildText("strong"))
        el.ForEach("strong", func(_ int, elem *colly.HTMLElement) {
            fmt.Println(elem.Text)
        })
    })
})
I have tried different ways to get these elements, but with no luck so far.
I noticed that the HTML differs depending on where I inspect it. Under "Inspect -> Elements" the HTML is as shown in the image, but under "Inspect -> Sources -> pagename" the HTML reads:
<ul class="ingredients">
<div class="ellipsis">
<div></div>
<div></div>
<div></div>
<div></div>
So is the reason I don't receive the ingredients my code, or the way the page is built? I am a complete beginner and don't understand why the HTML looks different in Elements vs. Sources. I'm looking for any kind of clue to get it working. Thanks and all the best!

Related

How to filter the html markups when render a template with jinja2?

I'm building a Django project that uses Jinja2 for templates. Some page content is submitted by the client through a WYSIWYG editor, and that works fine on the detail pages.
But the list pages, which show a slice of the content, render it wrongly.
My code:
<div class="summary ">
<div class="content">{{ question.content[:200]|e}}...</div>
</div>
But the output is:
<p>what i want to show here is raw text without markups</p>...
The expected result is that HTML markup such as <p></p>, <section>, and so on is removed (filtered out) and only the raw text is shown.
So how can I fix it? Thanks in advance!
Use the striptags filter:
striptags(value)
Strip SGML/XML tags and replace adjacent whitespace
by one space.
<div class="content">{{ question.content|striptags}}...</div>
The Jinja2 striptags filter test will also help you understand how it works.
Hope that helps.

Expression Engine categories channel outputting empty list

I have a problem outputting categories into a list.
The tag seems to create another ul (one for channel:categories) inside my ul, and it also creates an empty list before each list.
I used exactly the same code for entries and it worked fine.
Is this a categories problem?
Here is the code:
<ul>
<li><a {if segment_2 == ""} class="selected" {/if} href="">News & Events</a></li>
{exp:channel:categories
channel="news_events"
disable="pagination|member_data|trackbacks"
dynamic="no"}
<li>{category_name}</li>
{/exp:channel:categories}
<li><a {if segment_2 == "gallery"} class="selected"{/if} href="">Image Gallery</a></li>
Any help would be appreciated!
With regard to the code output, take a look at the style parameter for the channel categories tag: http://ellislab.com/expressionengine/user-guide/modules/channel/categories.html#channel-categories-style. You'll want to change it to style="linear" to use your own markup.
As for the blank output, it's going to be tough to diagnose without seeing your install but try getting rid of dynamic="no".

ModX: Using Ditto with template variables

I am having a great deal of difficulty getting my head around displaying several resources on one page with Ditto. I can't seem to get TVs to show along with my content.
Here's how I have set it up:
I have a page with my Ditto call:
[!Ditto? &parents='134' &orderBy='createdon ASC' &tpl='temp'!]
I have a simple chunk called temp set up as such:
<div id="content">
[*articlename*]
[+content+]
</div>
And I have a template with the TV articlename assigned to all the resources under parent 134.
The content shows fine but none of the TVs do. Can anyone point me in the right direction? Thanks!
I think the problem is in your syntax. You need to use a placeholder tag in the chunk for your TV:
Try this:
<div id="content"> [+articlename+] [+content+] </div>
I have found the answer: you are meant to use [+articlename+] for TVs inside a chunk rather than [*articlename*]. This is different from getResources.

can't add a link to an entire div section

I have a problem with TinyMCE in Joomla 2.5.4. I have tried for a few days now to add a link around a div section (like <div> something </div>) but failed: the anchor is stripped from the HTML because TinyMCE treats it as invalid in HTML4. After three days of research I gave up and used an unordered list instead of a div.
Now when I try to add a link around a list item (like <li> <p> something </p> </li>), TinyMCE rearranges everything and moves the anchor inside the list item (like <li> <a href="#"> <p> something </p> </a> </li>).
I have tried pretty much everything, from valid_elements : "[]" to setting the text filter to No Filtering, but I have run out of ideas.
Can anyone please help me?
Try playing around with TinyMCE's html5 options: http://www.tinymce.com/tryit/html5_formats.php
Hit "view source" to see how they're doing it. It's mainly this option inside tinyMCE.init:
schema: "html5",

Managing a list in umbraco 5

I have recently started working on an Umbraco 5 project and am finding it a bit of a struggle compared with Umbraco 4. What I am trying to do is have a list of items that is managed in the content section of the site, where users can add items to the list. This list is then used in drop-downs and page categories throughout the site.
I am looking for the best way to do this. I am currently part way through creating a property editor that manages a list of text boxes, but I'm not sure this is the best approach, and I'm not entirely sure how to go about doing even this. I can save one property with no problem, but a dynamic list?
Can anybody give me some idea of how they would go about doing this, with some code examples? There aren't many resources for Umbraco 5 out there at the moment.
Many thanks to those who contribute.
UPDATE
I have now copied the multiple textstring property editor from the source code and am looking to update it to have an extra text input. It uses the Knockout JavaScript library, which I'm not too familiar with. Below is the code I have so far. Does anyone know how I would update this to save both text values to the database?
<ul class="relatedlinks-textstring-inputs" data-bind="template: { name: 'textstringRow', foreach: textstrings }"></ul>
<script id="textstringRow" type="text/html">
<li style='width: 250px;'>Name<input type="text" data-bind="value: value" />Url<input type="text" data-bind="value: value" /></li>
</script>
@Html.HiddenFor(x => Model.Value, new Dictionary<string, object> { { "data-bind", "value: value" } })