Count records returned from $extend in Dynamics 365 Web API - microsoft-dynamics

Basically, I have something like this:
https://company.crm.dynamics.com/api/data/v9.0/accounts?$select=accountid,accountnumber,name&$expand=crm_productbuyer($select=name)
I'd like to be able to just get a count of the number of records returned in the $expand instead of deserializing it in C# and doing a count on it. Just one less thing to iterate over.
This didn't work for my case:
API $expand and $count
EDIT:
There are a number of things here that indicate this isn't doable:
https://learn.microsoft.com/en-us/dynamics365/customer-engagement/developer/webapi/retrieve-entity-using-web-api#retrieve-related-entities-for-an-entity-by-expanding-navigation-properties
You can’t use the /$ref or /$count path segments to return only the URI for the related entity or a count of the number of related entities.
This is a subset of the system query options described in the “11.2.4.2.1 Expand Options” section of OData Version 4.0 Part 1: Protocol Plus Errata 02. The options $skip, $count, $search, $expand and $levels aren’t supported for the Web API.
Will mark this as the answer unless someone else chimes in to demonstrate otherwise.

Doesn't sound like it is doable:
https://learn.microsoft.com/en-us/dynamics365/customer-engagement/developer/webapi/retrieve-entity-using-web-api#retrieve-related-entities-for-an-entity-by-expanding-navigation-properties
You can’t use the /$ref or /$count path segments to return only the URI for the related entity or a count of the number of related entities.
This is a subset of the system query options described in the “11.2.4.2.1 Expand Options” section of OData Version 4.0 Part 1: Protocol Plus Errata 02. The options $skip, $count, $search, $expand and $levels aren’t supported for the Web API.

Related

Is there an equivalent of utassert.eqtable (available in utplsql version 2.X) in utplsql version 3.3?

After going through the documentation of utPLSQL 3.0.2, I couldn't find any references to the assertion API available in the older versions. Please let me know whether there is an equivalent assertion like utassert.eqtable available in the newer versions.
I have just recently gone through the same pain. Most utPLSQL examples out there are for utPLSQL v2. It appears that the assertions have been deprecated and replaced by "Expects". I found a great blog post by Jacek Gebal that describes this. I've tried to put this and other useful links on a page about how unit testing fits into Redgate's Oracle DevOps pipeline (I work for Redgate and we often get asked how to best implement automated unit testing for Oracle).
I don't think you can compare tables straight away, but you can compare cursors, which is quite flexible, because you can, for instance, set up a cursor with test data based on a dual query, and then check that against the actual data in the table, something like this:
procedure TestCursorExample is
  v_Expected sys_refcursor;
  v_Actual   sys_refcursor;
begin
  -- Arrange (Nothing really to arrange, except setting the expectation).
  open v_Expected for
    select 'me#example.com' as Email
    from dual;

  -- Act
  SomeUpsertProc('me', 'me#example.com');

  -- Assert
  open v_Actual for
    select Email
    from Tbl_User
    where UserName = 'me';

  ut.expect(v_Actual).to_equal(v_Expected);
end;
Also, the example above works in Oracle 11, but if you're in 12c, apparently things got even easier, because you can use the table operator with locally defined types.
I've used a similar solution to verify that certain columns of a row were updated, while others were not. You can easily open a cursor for the original data, with some columns replaced by the new fixed values. Then do the update. Then open a cursor with the new actual data of all columns. You still have to write the queries, but it's way more compact than querying everything into variables and comparing those individually.
And, because you can open the 'expected' cursor before doing the actual 'act' step of the test, you can be sure that the query with 'expected' data is not affected by the test itself, and can even base that cursor on the data you are going to modify.
For comparing the data, the cursors are serialized to XML. This may have some side effects. In the test example above, my act step didn't actually do anything, so I got this difference, showing the count as well as showing the missing data.
If your cursors have more columns, and multiple differences, it can sometimes take a few seconds to spot the differences between the XML tags. Also, there are currently some edge-case issues with this, I think because of how trimming works in XML.
1) testcursorexample
Actual: refcursor [ count = 0 ] was expected to equal: refcursor [ count = 1 ]
Diff:
Rows: [ 1 differences ]
Row No. 1 - Missing: <EMAIL>me#example.com</EMAIL>
at "MySchema.MyTestPackage", line 410 ut.expect(v_Actual).to_equal(v_Expected);
See also: 'comparing cursors' from utPLSQL 3 concepts

Combining optional trailing slash pageviews in Google Analytics API pageviews report

I'm currently using this regex to grab a pageviews report for a certain segment of my site:
ga:pagePath=~^/cr/[a-zA-Z_0-9-]*\/*$
This returns all the pages under /cr to one level so it'll find /cr/somename and not /cr/somename/photos. The first level (somename) is all I want.
In the pageviews report, it breaks down separate numbers for results with or without trailing slashes:
/cr/somename/ 12
/cr/somename 4
/cr/othername 2
etc.
Is there any way in the API to combine the results so that the report will give me a combined pageview count of 16 for /cr/somename, or will I have to handle this addition in my code?
There is no way to ask the API to combine and return results. You'll have to handle this on your end.
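For what it's worth, here is a rough sketch of that client-side aggregation (shown in C++ purely for illustration; the row data, the normalizePath helper and the container choice are all made up). The idea is just to strip the trailing slash and sum the counts:

#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Strip a single trailing slash so "/cr/somename/" and "/cr/somename"
// collapse to the same key.
std::string normalizePath(std::string path) {
    if (path.size() > 1 && path.back() == '/')
        path.pop_back();
    return path;
}

int main() {
    // Hypothetical rows as they might come back from the pageviews report.
    std::vector<std::pair<std::string, int>> rows = {
        {"/cr/somename/", 12},
        {"/cr/somename", 4},
        {"/cr/othername", 2},
    };

    std::map<std::string, int> combined;
    for (const auto& row : rows)
        combined[normalizePath(row.first)] += row.second;

    for (const auto& entry : combined)
        std::cout << entry.first << " " << entry.second << "\n";
    // Prints: /cr/othername 2, then /cr/somename 16
}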

SAP JCo 3 RFC RSAQ_REMOTE_QUERY_CALL - unexpected results

We’re using JCo 3.0 to connect to RFCs and read data from SAP R/3. We use one RFC RFC_READ_TABLE often and use a second custom RFC to read employee information. My questions revolve around a third RFC RSAQ_REMOTE_QUERY_CALL. I'm calling an ad-hoc query I built in SAP using this RFC but I’m not getting the expected results. The main problem is that it appears that SAP is ignoring one of my selection criteria and using what was saved in SAP when I originally built it. The date criterion stored in my ad-hoc is 6/23/2013. If I pass in 6/28/2013 from JCo, I get the same results as if I had passed 6/23/2013 from JCo.
We have built several ad-hoc queries whose only criteria is a personnel number and call them successfully using RFC RSAQ_REMOTE_QUERY_CALL.
Background on my ad-hoc query: reporting period of today, joining together four aspects of an employee’s information: their latest action (hire, rehire, etc.), organization (e.g. company), pay (e.g. pay scale level) and communication (e.g. email). The query will run every workday.
Here are my questions:
My ad-hoc has three selection criteria. The first two are simple strings. The third is a date. The date will vary each time the query runs. We are referencing the first criterion using SP$00001, the second with SP$00002 and the third with SP$00003. The order of the criteria changes from the ad-hoc to SQ01 (what was SP$00001 in the ad-hoc is now SP$00003). Shouldn’t we reference them in the order defined in the ad-hoc (e.g. SP$00001)?
The two simple string selections use OPTION “EQ”. The date criterion uses OPTION “GT” (greater than). Is “GT” correct?
We have some limited accessibility to SAP. Is there a way to see which SP$ parameters are mapped to which criteria?
If my ad-hoc was saved with five criteria but four of them never change when I call the ad-hoc from JCo, do I just need to set the value of the one or do I need to set the other four as well?
Do I have to call this ad-hoc using a variant (function.getImportParameterList().setValue(“VARIANT”, “VARIANT_NAME”))?
Does the Reporting Period have an impact on the date criteria? I have tried changing the Reporting Period to be PNPBEGDA = today and PNPENDDA = today and noticed no change.
Is there a way in SAP to get a “declaration” of your ad-hoc (name, inputs, outputs, criteria)? I have looked at JCoFunction.toXml() and JCoFunctionTemplate. These are good if you want to see something at runtime before it goes to SAP, but I’m looking for something I can use on the JCo end to help me write Java code that matches the ad-hoc.
I have looked at length on the web for answers to my questions and have not found anything that is useful. If there is anything which would help me, please let me know.
Thanks,
LM
Since I don't know much about SQnn, I won't be able to answer all of your questions...
I don't know, sorry.
It should be, at least it's the usual operator for greater than.
Yes - set an external breakpoint right inside the function module and trace its execution while performing the RFC call. Warning: At least basic ABAP knowledge required.
I don't know, sorry.
I don't know either, sorry.
That would depend on the query, I suspect...
JCo won't be able to help you out there - it doesn't know about queries, it only knows function modules. There might be other RSAQ_* function modules to get that information though.
I played with setting up a variant in SQ01 for my query. I added some settings in the variant that solved my problem and answered several of my questions in my post. The main thing I did was add a dynamically calculated date as part of my criteria. Here's how:
1. In SQ01, access menu "Go To" -> "Maintain Variants".
2. Choose your variant and in subobjects, choose "Attributes" and click "Change".
3. In the displayed list, find your date criterion.
4. Choose "D" in Selection Variable, choose a comparison option (mine was GT for greater than), and a "Name of a Variable" (really, this is the type of dynamic date calculation you need).
5. Go back to the Subobjects panel, choose "Values" and click "Change".
6. Enter any other criteria you need in the "Program selections" section.
7. Save the variant.
By doing this, I don't need to pass anything into the query from JCo. Also, SAP will automatically update the date criteria you entered in step #4 above.
So, to answer my questions from my original post:
1 and 4. It doesn't matter because I'm no longer passing anything in from JCo.
2. "GT" is Greater Than.
3 and 7. If anyone knows, I'd really like to find out.
5. Use the name as it is in SAP (step #2 above).
6. I still don't know, but it's not holding me up.
I'm posting this in case anyone out there needs this type of information. Thanks to Esti and vwegert for helping me out.

Getting stocks by industry via Yahoo Finance

I want to list all available industries (like http://biz.yahoo.com/p/ ) and show all the corresponding stocks.
Until now I'm using YAHOO.Finance.SymbolSuggest.ssCallback for the symbol suggestion and http://finance.yahoo.com/d/quotes.csv?s=... for getting the stock's data.
Does anyone have any idea how to get all industries and corresponding stocks?
Is there another hidden Yahoo API?
Lists of all available industries are called GICS Sectors for Standard and Poor's (the S&P 500 will use that) and ICB for Dow Jones and FTSE; hence it is used by Nasdaq, NYSE and other markets.
It seems like Yahoo uses a third industry classification by Morning Star, but since I'm not quite sure I will give both ways of retrieving data.
Morning Star
I don't know if Yahoo really sticks to this classification, but some names were really close, so let's see:
You need to go to their Index Data page, click on each sector, and then at the bottom choose View complete index holdings.
It's not as precise as the Yahoo industry list, but it's all you can do with Morning Star. Not very convincing, I know...
GICS Sectors
GICS Sectors are now a trademark of Standard and Poor's, so the data has to be sought on S&P's website.
Short answer: take a look at this page, you will need to be registered (it's free and easy) and you can download spreadsheets (xls) with stocks and corresponding sectors. Nevertheless, things aren't always easy, and you will have to do a bit of a search to retrieve all stocks with their corresponding industries. For example, the file INDICATED_RATE_CHANGE.xls will give you some companies and their sectors in each month of 2012. Using that and SP500_DividendAristocrats_2012.xls you should be able to retrieve at least a large part of S&P 500 companies.
ICB
ICB is used by NYSE, NASDAQ, etc., so it's a lot simpler than S&P and Morning Star. Here is your answer. BOOM! Direct link!
Link is dead :(
Finally
I strongly advise you to use the simpler and most-used industry classification index: the ICB. It will always be available and publicly displayed, since millions of investors rely on it every day, without having to use S&P financial services or MorningStar brokerage services...
EDIT
You can look at nasdaq.com to retrieve all companies and their corresponding sector: here for Nasdaq and here for Nyse
Get all industry-IDs from here:
http://biz.yahoo.com/ic/ind_index.html
(look at the links)
Then use YQL ( https://developer.yahoo.com/yql/console/ )
with a query like this:
select * from yahoo.finance.industry where id=912

Create and use HTML full text search index (C++)

I need to create a search index for a collection of HTML pages.
I have no experience in implementing a search index at all, so any general information on how to build one, what information to store, how to implement advanced searches such as "entire phrase", ranking of results, etc. would be welcome.
I'm not afraid to build it myself, though I'd be happy to reuse an existing component (or use one to get started with a prototype). I am looking for a solution accessible from C++, preferably without requiring additional installations at runtime. The content is static (so it makes sense to aggregate search information), but a search might have to accumulate results from multiple such repositories.
I can make a few educated guesses, though: create a map word ==> pages for all (relevant) words; a rank can be assigned to the mapping by prominence (h1 > h2 > ... > <p>) and proximity to the top. Advanced searches could be built on top of that: searching for the phrase "homo sapiens" could list all pages that contain "homo" and "sapiens", then scan all the pages returned for locations where they occur together. However, there are a lot of problematic scenarios and unanswered questions, so I am looking for references to what should be a huge amount of existing work that somehow escapes my google-fu.
[edit for bounty]
The best resource I found until now is this and the links from there.
I do have an implementation roadmap for an experimental system; however, I am still looking for:
Reference material regarding index creation and individual steps
available implementations of individual steps
reusable implementations (with above environment restrictions)
This process is generally known as information retrieval. You'll probably find this online book helpful.
Existing libraries
Here are two existing solutions that can be fully integrated into an application without requiring a separate process (I believe both will compile with VC++).
Xapian is mature and may do much of what you need, from indexing to ranked retrieval. Separate HTML parsing would be required because, AFAIK, it does not parse html (it has a companion program Omega, which is a front end for indexing web sites).
Lucene is an indexing/searching Apache library in Java, with an official pre-release C version, Lucy, and an unofficial C++ version, CLucene.
Implementing information retrieval
If the above options are not viable for some reason, here's some info on the individual steps of building and using an index. Custom solutions can go from simple to very sophisticated, depending on what you need for your application. I've broken the process into 5 steps:
HTML processing
Text processing
Indexing
Retrieval
Ranking
HTML Processing
There are two approaches here:
Stripping The page you referred to discusses a technique generally known as stripping, which involves removing all the HTML elements that won't be displayed and translating others to their display form. Personally, I'd preprocess using perl and index the resulting text files. But for an integrated solution, particularly one where you want to record significance tags (e.g. <h1>, <h2>), you probably want to roll your own. Here is a partial implementation of a C++ stripping routine (it appears in Thinking in C++; the final version of the book is here) that you could build from; a rough sketch of the idea also follows the parsing notes below.
Parsing A level up in complexity from stripping is html parsing, which would help in your case for recording significance tags. However, a good C++ HTML parser is hard to find. Some options might be htmlcxx (never used it, but active and looks promising) or hubbub (C library, part of NetSurf, but claims to be portable).
If you are dealing with XHTML or are willing to use an HTML-to-XML converter, you can use one of the many available XML parsers. But again, HTML-to-XML converters are hard to find; the only one I know of is HTML Tidy. In addition to conversion to XHTML, its primary purpose is to fix missing/broken tags, and it has an API that could possibly be used to integrate it into an application. Given XHTML documents, there are many good XML parsers, e.g. Xerces-C++ and tinyXML.
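To make the stripping approach a little more concrete, here is a deliberately naive sketch of my own (not the Thinking in C++ routine) that drops tags and records a crude weight per text run based on the most recent heading tag. It ignores attributes, comments, scripts and entities, so treat it as a starting point only:

#include <cctype>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Naive stripper: removes everything between '<' and '>', and tags each
// remaining text run with a weight based on the most recent heading tag.
// Tags with attributes (e.g. <h1 class=...>) and broken markup are not handled.
std::vector<std::pair<std::string, int>> stripWithWeights(const std::string& html) {
    std::vector<std::pair<std::string, int>> runs;
    int weight = 1;          // 1 = ordinary body text
    bool inTag = false;
    std::string tag, text;

    auto flush = [&]() {
        if (!text.empty()) { runs.emplace_back(text, weight); text.clear(); }
    };

    for (char c : html) {
        if (c == '<') { flush(); inTag = true; tag.clear(); }
        else if (c == '>' && inTag) {
            inTag = false;
            if (tag == "h1") weight = 10;
            else if (tag == "h2") weight = 5;
            else if (tag == "/h1" || tag == "/h2") weight = 1;
        }
        else if (inTag) tag += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        else text += c;
    }
    flush();
    return runs;
}

int main() {
    for (const auto& run : stripWithWeights("<h1>Title</h1><p>Some body text.</p>"))
        std::cout << run.second << ": " << run.first << "\n";
    // Prints "10: Title" and "1: Some body text."
}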
Text Processing
For English at least, processing text to words is pretty straightforward. There are a couple of complications when search is involved though.
Stop words are words known a priori not to provide a useful distinction between documents in the set, such as articles and prepositions. Often these words are not indexed and are filtered from query streams. There are many stop word lists available on the web, such as this one.
Stemming involves preprocessing documents and queries to identify the root of each word to better generalize a search. E.g. searching for "foobarred" should yield "foobarred", "foobarring", and "foobar". The index can be built and searched on roots alone. The two general approaches to stemming are dictionary based (lookups from word ==> root) and algorithm based. The Porter algorithm is very common and several implementations are available, e.g. C++ here or C here. Stemming in the Snowball C library supports several languages.
Soundex encoding One method to make search more robust to spelling errors is to encode words with a phonetic encoding. Then when queries have phonetic errors, they will still map directly to indexed words. There are a lot of implementations around, here's one.
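As an illustration, a simplified Soundex encoder could be sketched like this (it skips the official special cases for 'h'/'w' separators, so it's not a reference implementation):

#include <cctype>
#include <cstddef>
#include <iostream>
#include <string>

// Simplified Soundex: keep the first letter, map remaining consonants to
// digits, collapse adjacent duplicates, pad/truncate to 4 characters.
std::string soundex(const std::string& word) {
    auto code = [](char c) -> char {
        switch (std::tolower(static_cast<unsigned char>(c))) {
            case 'b': case 'f': case 'p': case 'v': return '1';
            case 'c': case 'g': case 'j': case 'k':
            case 'q': case 's': case 'x': case 'z': return '2';
            case 'd': case 't': return '3';
            case 'l': return '4';
            case 'm': case 'n': return '5';
            case 'r': return '6';
            default: return '0';   // vowels and h/w/y produce no digit
        }
    };

    if (word.empty()) return "";
    std::string result(1, static_cast<char>(std::toupper(static_cast<unsigned char>(word[0]))));
    char prev = code(word[0]);
    for (std::size_t i = 1; i < word.size() && result.size() < 4; ++i) {
        char c = code(word[i]);
        if (c != '0' && c != prev) result += c;
        prev = c;
    }
    result.append(4 - result.size(), '0');
    return result;
}

int main() {
    std::cout << soundex("Robert") << " " << soundex("Rupert") << "\n"; // R163 R163
}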
Indexing
The map word ==> page data structure is known as an inverted index. It's inverted because it's often generated from a forward index of page ==> words. Inverted indexes generally come in two flavors: an inverted file index, which maps words to each document they occur in, and a full inverted index, which maps words to each position in each document they occur in.
The important decision is what backend to use for the index, some possibilities are, in order of ease of implementation:
SQLite or Berkeley DB - both of these are database engines with C++ APIs that integrate into a project without requiring a separate server process. Persistent databases are essentially files, so multiple index sets can be searched by just changing the associated file. Using a DBMS as a backend simplifies index creation, updating and searching.
In memory data structure - if you're using an inverted file index that is not prohibitively large (memory consumption and time to load), this could be implemented as a std::map<std::string,word_data_class>, using boost::serialization for persistence.
On disk data structure - I've heard of blazingly fast results using memory mapped files for this sort of thing, YMMV. Having an inverted file index would involve having two index files, one representing words with something like struct {char word[n]; unsigned int offset; unsigned int count; };, and the second representing (word, document) tuples with just unsigned ints (words implicit in the file offset). The offset is the file offset of the first document id for the word in the second file, and count is the number of document ids associated with that word (the number of ids to read from the second file). Searching would then reduce to a binary search through the first file with a pointer into a memory mapped file. The downside is the need to pad/truncate words to get a constant record size.
The procedure for indexing depends on which backend you use. The classic algorithm for generating an inverted file index (detailed here) begins with reading through each document and extending a list of (page id, word) tuples, ignoring duplicate words in each document. After all documents are processed, sort the list by word, then collapse it into (word, (page id1, page id2, ...)).
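As a minimal sketch of the in-memory variant combined with that algorithm (whitespace tokenization, page text assumed to be already stripped, and a plain list of ids standing in for word_data_class):

#include <iostream>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Minimal in-memory inverted file index: word -> sorted list of page ids.
// Duplicate words within a page are ignored, as in the classic algorithm.
using InvertedIndex = std::map<std::string, std::vector<int>>;

InvertedIndex buildIndex(const std::vector<std::string>& pages) {
    InvertedIndex index;
    for (int pageId = 0; pageId < static_cast<int>(pages.size()); ++pageId) {
        std::istringstream words(pages[pageId]);
        std::set<std::string> seen;            // skip duplicates per page
        std::string word;
        while (words >> word)
            if (seen.insert(word).second)
                index[word].push_back(pageId); // ids stay sorted because pages
                                               // are processed in order
    }
    return index;
}

int main() {
    InvertedIndex index = buildIndex({"homo sapiens sapiens", "homo erectus"});
    for (const auto& entry : index) {
        std::cout << entry.first << ":";
        for (int id : entry.second) std::cout << " " << id;
        std::cout << "\n";
    }
}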
The mifluz GNU library implements inverted indexes with storage, but without document or query parsing. It's GPL, so it may not be a viable option, but it will give you an idea of the complexities involved for an inverted index that supports a large number of documents.
Retrieval
A very common method is boolean retrieval, which is simply the union/intersection of documents indexed for each of the query words that are joined with or/and, respectively. These operations are efficient if the document ids are stored in sorted order for each term, so that algorithms like std::set_union or std::set_intersection can be applied directly.
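For instance, an AND query over two sorted posting lists could be sketched like this (the document ids are made up):

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

// AND-style boolean retrieval over posting lists kept in sorted order:
// intersect the document id lists of the two query terms.
std::vector<int> booleanAnd(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> result;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(result));
    return result;
}

int main() {
    std::vector<int> homo    = {1, 3, 4, 7};   // pages containing "homo"
    std::vector<int> sapiens = {2, 3, 7, 9};   // pages containing "sapiens"
    for (int id : booleanAnd(homo, sapiens))
        std::cout << id << " ";                // prints: 3 7
    std::cout << "\n";
}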
There are variations on retrieval; Wikipedia has an overview, but standard boolean is good for many/most applications.
Ranking
There are many methods for ranking the documents returned by boolean retrieval. Common methods are based on the bag of words model, which just means that the relative position of words is ignored. The general approach is to score each retrieved document relative to the query, and rank documents based on their calculated score. There are many scoring methods, but a good starting place is the term frequency-inverse document frequency formula.
The idea behind this formula is that if a query word occurs frequently in a document, that document should score higher, but a word that occurs in many documents is less informative so this word should be down weighted. The formula is, over query terms i=1..N and document j
score[j] = sum_over_i(word_freq[i,j] * inv_doc_freq[i])
where the word_freq[i,j] is the number of occurrences of word i in document j, and
inv_doc_freq[i] = log(M/doc_freq[i])
where M is the number of documents and doc_freq[i] is the number of documents containing word i. Notice that words that occur in all documents will not contribute to the score. A more complex scoring model that is widely used is BM25, which is included in both Lucene and Xapian.
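A minimal sketch of that scoring formula (the plain tf-idf version above, not BM25), assuming each retrieved document has been reduced to a term-count map, might look like:

#include <cmath>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// score[j] = sum over query terms i of word_freq[i,j] * log(M / doc_freq[i])
std::vector<double> tfIdfScores(
        const std::vector<std::map<std::string, int>>& docs,
        const std::vector<std::string>& query) {
    const double M = static_cast<double>(docs.size());
    std::vector<double> scores(docs.size(), 0.0);

    for (const std::string& term : query) {
        int docFreq = 0;
        for (const auto& doc : docs)
            if (doc.count(term)) ++docFreq;
        if (docFreq == 0) continue;             // term appears nowhere
        const double idf = std::log(M / docFreq);
        for (std::size_t j = 0; j < docs.size(); ++j) {
            auto it = docs[j].find(term);
            if (it != docs[j].end())
                scores[j] += it->second * idf;  // word_freq[i,j] * inv_doc_freq[i]
        }
    }
    return scores;
}

int main() {
    std::vector<std::map<std::string, int>> docs = {
        {{"homo", 2}, {"sapiens", 1}},
        {{"homo", 1}, {"erectus", 1}},
    };
    // "homo" occurs in every document, so it contributes nothing to the score.
    for (double s : tfIdfScores(docs, {"homo", "sapiens"}))
        std::cout << s << "\n";
}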
Often, effective ranking for a particular domain is obtained by adjusting by trial and error. A starting place for incorporating heading/paragraph context could be inflating word_freq for a word based on where it occurs, e.g. 1 for a paragraph, 10 for a top-level heading. For some other ideas, you might find this paper interesting, where the authors adjusted BM25 ranking for positional scoring (the idea being that words closer to the beginning of the document are more relevant than words toward the end).
Objective quantification of ranking performance is obtained by precision-recall curves or mean average precision, detailed here. Evaluation requires an ideal set of queries paired with all the relevant documents in the set.
Depending on the size and number of the static pages, you might want to look at an existing search solution.
"How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles."
I would choose the Sphinx engine for full text searching. The licence is GPL, but they also have a commercial version available. It is meant to be run stand-alone [2], but it can also be embedded into applications by extracting the needed functionality (be it indexing [1], searching [3], stemming, etc).
The data should be obtained by parsing the input HTML files and transforming them to plain-text by using a parser like libxml2's HTMLparser (I haven't used it, but they say it can parse even malformed HTML). If you aren't bound to C/C++ you could take a look at Beautiful Soup.
After obtaining the plain text, you could store it in a database like MySQL or PostgreSQL. If you want to keep everything embedded, you should go with sqlite.
Note that Sphinx doesn't work out-of-the-box with sqlite, but there is an attempt to add support (sphinx-sqlite3).
I would attack this with a little sqlite database. You could have tables for 'page', 'term' and 'page term'. 'Page' would have columns like id, text, title and url. 'Term' would have a column containing a word, as well as the primary ID. 'Page term' would have foreign keys to a page ID and a term ID, and could also store the weight, calculated from the distance from the top and the number of occurrences (or whatever you want).
Perhaps a more efficient way would be to only have two tables - 'page' as before, and 'page term' which would have the page ID, the weight, and a hash of the term word.
An example query - you want to search for "foo". You hash "foo", then query all page term rows that have that term hash. Sort by descending weight and show the top ten results.
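For a rough idea of what this could look like with the sqlite3 C API, here is a minimal sketch of the two-table variant (the table and column names, the literal hash value standing in for the hash of "foo", and the weights are all placeholders, not a finished design):

#include <cstdio>
#include <sqlite3.h>

// Callback used by sqlite3_exec to print each result row.
static int printRow(void*, int argc, char** argv, char** colName) {
    for (int i = 0; i < argc; ++i)
        std::printf("%s=%s  ", colName[i], argv[i] ? argv[i] : "NULL");
    std::printf("\n");
    return 0;
}

int main() {
    sqlite3* db = nullptr;
    if (sqlite3_open("index.db", &db) != SQLITE_OK) return 1;

    const char* schema =
        "CREATE TABLE IF NOT EXISTS page ("
        "  id INTEGER PRIMARY KEY, url TEXT, title TEXT, text TEXT);"
        "CREATE TABLE IF NOT EXISTS page_term ("
        "  page_id INTEGER REFERENCES page(id),"
        "  term_hash INTEGER,"          // hash of the term word, as suggested above
        "  weight REAL);"
        "CREATE INDEX IF NOT EXISTS idx_term ON page_term(term_hash);";
    sqlite3_exec(db, schema, nullptr, nullptr, nullptr);

    // Query for one term hash (12345 is a placeholder for the hash of "foo"),
    // best-weighted pages first, top ten results.
    const char* query =
        "SELECT p.url, t.weight FROM page_term t"
        " JOIN page p ON p.id = t.page_id"
        " WHERE t.term_hash = 12345"
        " ORDER BY t.weight DESC LIMIT 10;";
    sqlite3_exec(db, query, printRow, nullptr, nullptr);

    sqlite3_close(db);
    return 0;
}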
I think this should query reasonably quickly, though it obviously depends on the number and size of the pages in question. Sqlite isn't difficult to bundle and shouldn't need an additional installation.
Ranking pages is the really tricky bit here. With a large sample of pages you can use links quite a lot in working out ranks. Otherwise you need to check how words seem to be placed, and also make sure your engine doesn't get fooled by 'dictionary' pages.
Good luck!