May I help you?

Friday, April 10, 2015

AdWords and Generalized On-line Matching

How does a search engine company like Google decide what ads to display with each query so as to maximize its revenue? This turns out to be a generalization of the online bipartite matching problem.

Here we introduce the notion of a tradeoff-revealing LP and use it to derive an optimal algorithm achieving a competitive ratio of 1 - 1/e for this problem.

Introduction
Internet search engine companies, such as Google, Yahoo and MSN, have revolutionized not only the use of the Internet by individuals but also the way businesses advertise to consumers. Instead of flooding consumers with unwanted ads, search engines open up the possibility of a dialogue between consumers and businesses, with consumers typing in keywords, called AdWords by Google, that reveal what they are looking for, and search engines displaying highly targeted ads relevant to the specific query.
The AdWords market is essentially a large auction where businesses place bids for individual keywords, together with limits specifying their maximum daily budget. The search engine company earns revenue from businesses when it displays their ads in response to a relevant search query (if the user actually clicks on the ad). Indeed, most of the revenues of search engine companies are derived in this manner.

It is well known that Internet companies, such as Amazon and eBay, are able to open up new markets by tapping into the fat tails of distributions. This holds for search engine companies as well: the advertising budgets of companies and organizations follow a power-law distribution, and unlike conventional advertising, search engine companies are able to cater to low-budget advertisers on the fat tail of this distribution. This is partially responsible for their dramatic success. The following computational problem, which we call the Adwords problem, was posed to us by Henzinger: assign user queries to advertisers to maximize the total revenue. Observe that the task is necessarily online: when returning results of a specific query, the search engine company needs to immediately determine what ads to display on the side. The computational problem is specified as follows: each advertiser places bids on a number of keywords and specifies a maximum daily budget. As queries arrive during the day, they must be assigned to advertisers.
The objective is to maximize the total revenue while respecting the daily budgets.

This market dwarfs the AdSense market, where the ad is based on the actual contents of the website. According to a recent New York Times article (Feb 4, 2005), the revenue accrued by Google from this market in the last three months of 2004 alone was over a billion dollars.

In this article, we present a deterministic algorithm achieving a competitive ratio of 1-1/e for this problem, under the assumption that bids are small compared to budgets. The algorithm is simple and time efficient.
No randomized algorithm can achieve a better competitive ratio, even under this assumption of small bids.

This is how AdWords advertisement works for search engines; the algorithm and its analysis can be generalized to the following more realistic situations while still maintaining the same competitive ratio:
  •  A bidder pays only if the user clicks on his ad.
  •  Advertisers have different daily budgets.
  •  Instead of charging a bidder his actual bid, the search engine company charges him the next highest bid.
  •  Multiple ads can appear with the results of a query.
  • Advertisers enter at different times.

Previous work
The adwords problem is clearly a generalization of the online bipartite matching problem: the special case where each advertiser makes unit bids and has a unit daily budget is precisely the online matching problem. Even in this special case, the greedy algorithm achieves a competitive ratio of 1/2. The algorithm that allocates each query to a random interested advertiser does not do much better: it achieves a competitive ratio of 1/2 + O(log n / n).

In [KVV90], Karp, Vazirani and Vazirani gave a randomized algorithm for the online matching problem achieving a competitive ratio of 1- 1/e. Their algorithm, called RANKING, fixes a random permutation of the bidders in advance and breaks ties according to their ranking in this permutation. They further showed that no randomized online algorithm can achieve a better competitive ratio.

In another direction, Kalyanasundaram and Pruhs [KP00] considered the online b-matching problem, which can be described as a special case of the adwords problem as follows: each advertiser has a daily budget of b dollars, but makes only 0/1 dollar bids on each query. Their online algorithm, called BALANCE, awards the query to the interested advertiser who has the highest unspent budget. They show that the competitive ratio of this algorithm tends to 1 - 1/e as b tends to infinity. They also prove a lower bound of 1 - 1/e for deterministic algorithms.
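As a quick illustration, here is a minimal Python sketch of the BALANCE rule (my own illustrative names and data structures, not the authors' code), assuming unit bids:

def balance(queries, interested, budgets):
    # BALANCE for online b-matching: each advertiser bids 0 or 1 dollar
    # per query; award each query to the interested advertiser with the
    # largest unspent budget.
    unspent = dict(budgets)              # advertiser -> remaining budget
    revenue = 0
    for q in queries:
        # advertisers who bid 1 on this query and still have budget left
        candidates = [a for a in interested[q] if unspent[a] > 0]
        if not candidates:
            continue                     # query goes unassigned
        winner = max(candidates, key=lambda a: unspent[a])
        unspent[winner] -= 1             # charge the unit bid
        revenue += 1
    return revenue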

New idea
To generalize the algorithms of [KP00] to arbitrary bids, it is instructive to examine the special case with bids restricted to {0, 1, 2}. The natural algorithm to try assigns each query to a highest bidder, using the previous heuristic to break ties (largest remaining budget). We provide an example in the appendix to show that such an algorithm achieves a competitive ratio strictly smaller than and bounded away from 1 - 1/e.

This indicates the need to consider a much more delicate tradeoff between the bid and the unspent budget. The correct tradeoff function is derived by a novel LP-based approach, which we outline below. The resulting algorithm is very simple, and is based on the following tradeoff function: allocate each query to the bidder maximizing the product of his bid and ψ(T), where T is the fraction of the bidder's budget spent so far and

ψ(T) = 1 - e^(-(1 - T)).
The algorithm assumes that the daily budget of advertisers is large compared to their bids.
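Here is a minimal Python sketch of this allocation rule, assuming the tradeoff function ψ(T) = 1 - e^(-(1-T)) above; the function and variable names are my own illustration, not the paper's code:

import math

def psi(T):
    # Tradeoff function: discounts a bid by the fraction T of the
    # bidder's budget already spent (T in [0, 1]).
    return 1.0 - math.exp(-(1.0 - T))

def adwords(queries, bids, budgets):
    # Assign each arriving query to the bidder maximizing
    # bid * psi(fraction of budget spent), respecting daily budgets.
    spent = {a: 0.0 for a in budgets}
    revenue = 0.0
    for q in queries:
        best, best_score = None, 0.0
        for a, bid in bids[q].items():   # bidders with a bid on q
            left = budgets[a] - spent[a]
            if left <= 0:
                continue                 # budget exhausted
            score = bid * psi(spent[a] / budgets[a])
            if score > best_score:
                best, best_score = a, score
        if best is not None:
            charge = min(bids[q][best], budgets[best] - spent[best])
            spent[best] += charge
            revenue += charge
    return revenue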

Subscribe for new innovations in this area.
 





Wednesday, April 1, 2015

Structured Data to Knowledge Graph and Knowledge Graph Identification, Bayes' Theorem and the Theorem of Total Probability Used as Predictive Algorithms

Large-scale information processing systems are able to extract massive collections of interrelated facts, but unfortunately transforming these candidate facts into useful knowledge is a formidable challenge. In this paper, we show how uncertain extractions about entities and their relations can be transformed into a knowledge graph. The extractions form an extraction graph, and we refer to the task of removing noise, inferring missing information, and determining which candidate facts should be included in a knowledge graph as knowledge graph identification. In order to perform this task, we must reason jointly about candidate facts and their associated extraction confidences, identify co-referent entities, and incorporate ontological constraints. Our proposed approach uses probabilistic soft logic (PSL), a recently introduced probabilistic modeling framework which easily scales to millions of facts. We demonstrate the power of our method on a synthetic Linked Data corpus derived from the MusicBrainz music community and a real-world set of extractions from the NELL project containing over 1M extractions and 70K ontological relations. We show that compared to existing methods, our approach is able to achieve improved AUC and F1 with significantly lower running time.

The web is a vast repository of knowledge, but automatically extracting that knowledge at scale has proven to be a formidable challenge. Recent evaluation efforts have focused on automatic knowledge base population [1, 2], and many well-known broad-domain and open information extraction systems exist, including the Never-Ending Language Learning (NELL) project, OpenIE, and efforts at Google, which use a variety of techniques to extract new knowledge, in the form of facts, from the web. These facts are interrelated, and hence, recently, this extracted knowledge has been referred to as a knowledge graph.

A key challenge in producing the knowledge graph is incorporating noisy information from different sources in a consistent manner. Information extraction systems operate over many source documents, such as web pages, and use a collection of strategies to generate candidate facts from the documents, spanning syntactic, lexical and structural features of text. Ultimately, these extraction systems produce candidate facts that include a set of entities, attributes of these entities, and the relations between these entities, which we refer to as the extraction graph. However, errors in the extraction process introduce inconsistencies in the extraction graph, which may contain duplicate entities and violate key ontological constraints such as subsumption, mutual exclusion, inverse, domain and range constraints. Such noise obscures the true knowledge graph, which captures a consistent set of entities, attributes and relations.

Google's work infers the knowledge graph from the extraction graph generated by an information extraction system.
We demonstrate that the errors encountered by information extraction systems require jointly reasoning over candidate facts to construct a consistent knowledge graph. Our approach performs entity resolution, collective classification and link prediction while also enforcing global constraints on the knowledge graph, a process which we refer to as knowledge graph identification.

In order to implement knowledge graph identification, we use probabilistic soft logic (PSL), a recently introduced framework for reasoning probabilistically over continuously-valued random variables. PSL provides many advantages: models are easily defined using declarative rules with first-order logic syntax, continuously-valued variables provide a convenient representation of uncertainty, weighted rules and weight learning capture the importance of model rules, and advanced features such as set-based aggregates and hard constraints are supported. In addition, inference in PSL is a convex optimization that is highly scalable, allowing us to handle millions of facts in minutes.
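As a rough illustration of the soft-logic machinery underlying PSL (a sketch of the standard Lukasiewicz relaxation, not PSL's actual API), logical connectives operate on truth values in [0, 1], and a weighted rule contributes a "distance to satisfaction" that inference tries to minimize:

def l_and(a, b):
    # Lukasiewicz t-norm: soft conjunction of truth values in [0, 1].
    return max(0.0, a + b - 1.0)

def l_or(a, b):
    # Lukasiewicz t-conorm: soft disjunction.
    return min(1.0, a + b)

def l_not(a):
    # Soft negation.
    return 1.0 - a

def distance_to_satisfaction(body, head):
    # How far the soft rule body -> head is from being satisfied:
    # zero when head >= body, growing linearly otherwise.
    return max(0.0, body - head)

# Toy weighted rule: extracted(X, Y) -> holds(X, Y), with weight 0.8.
# With extraction confidence 0.9 and candidate truth value 0.6, the
# rule incurs a weighted penalty of 0.8 * max(0, 0.9 - 0.6) = 0.24.
penalty = 0.8 * distance_to_satisfaction(0.9, 0.6)
print(penalty)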

Convex optimization is a mathematical concept. In a 3-D system, we define the quantity to be optimized along the z-axis as a function of a point in the x-y plane, where x and y are the domain, i.e., the parameters upon which the quantity to be optimized depends. Mathematically it can be represented as

z = f(x, y)

where z is the quantity to be optimized and x, y are the variables upon which z depends. We can extend this concept to an n-dimensional system too.
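A tiny numerical sketch (my own toy example, not the PSL inference code) that minimizes a convex z = f(x, y) by gradient descent:

def f(x, y):
    # A convex bowl, minimized at (1, -2).
    return (x - 1.0) ** 2 + (y + 2.0) ** 2

def grad_f(x, y):
    # Gradient of f, used to walk downhill.
    return 2.0 * (x - 1.0), 2.0 * (y + 2.0)

x, y, step = 0.0, 0.0, 0.1
for _ in range(200):                     # plain gradient descent
    gx, gy = grad_f(x, y)
    x, y = x - step * gx, y - step * gy

print(round(x, 3), round(y, 3))          # approaches the minimizer (1, -2)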

Here we develop a PSL model for knowledge graph identification that both captures probabilistic dependencies between facts (which will be analysed in the next paper) and enforces global constraints between entities and relations. Through this model, we define a probability distribution over interpretations - or truth value assignments to facts - each of which corresponds to a possible knowledge graph. By performing inference using the extraction graph and an ontology, we are able to find the most probable knowledge graph. We will try to establish the benefits of our approach on two large datasets: a synthetic dataset derived from the MusicBrainz community with ontological relationships defined in the Music Ontology, as well as noisy extractions from NELL, a large-scale operational knowledge extraction system.
What you readers can try is
1) formulating the knowledge graph identification problem that supports reasoning about multiple, uncertain extractor sources in the presence of ontological constraints;
2) solving knowledge graph identification efficiently with convex optimization using PSL; and
3) demonstrating the power of knowledge graph identification by presenting results on benchmark datasets that are superior to state-of-the-art methods and generating massive knowledge graphs on the scale of minutes that are infeasible to compute in competing systems.


What is the Motivation behind Knowledge Graph Identification?
We represent the candidate facts from an information extraction system as a knowledge graph where entities are nodes, categories are labels associated with each node, and relations are directed edges between the nodes. Information extraction systems can extract such candidate facts, and these extractions can be used to construct an extraction graph. Unfortunately, the extraction graph is often incorrect, with errors such as spurious and missing nodes and edges, and missing or inaccurate node labels. Our approach, knowledge graph identification (KGI), combines the tasks of entity resolution, collective classification and link prediction, mediated by rules based on ontological information.

From Structured Data to the Knowledge Graph
What actually transforms structured data into the knowledge graph?
To understand this, we need to understand the following terms, which can be of immense utility for web developers who introduce structured data into their HTML code, thus helping to produce snippets - the brief introductory pieces of information that a search engine shows in its output in response to the queries a customer types in.

Agenda
● The end of search as we know it
● Knowledge Graph
● Schema.org and markup
● Developer tools
● What the future holds


If you type any search query into the search engine, the intelligent search engine does the following:
  1. Answers your query according to the text entered.
  2. Anticipates from the text you have entered.
How can a search engine understand the world? Can it see the world through sensors like a camera? Can it sense the environment through sensors for temperature, humidity, sound and so on? Can it automatically learn about the environment and its people and places through sensors and robot programs?

A knowledge graph about an actor has a lot of nodes connected directly or indirectly to the central node, which here is the actor.
The Knowledge Graph helps answer users' queries: the text you enter in the search engine can give you direct answers, and the engine will also run a prediction algorithm based on the text you have entered and the search history of that computer. This prediction algorithm is based on Bayes' theorem.

Bayes' Theorem and Partition of a sample space
Consider that there are two bags, I and II. Bag I contains 2 white and 3 red balls and Bag II contains 4 white and 5 red balls. One ball is drawn at random from one of the bags. We can find the probability of selecting either of the bags (i.e., 1/2), or the probability of drawing a ball of a particular colour (say white) from a particular bag (say Bag I). In other words, we can find the probability that the ball drawn is of a particular colour, if we are given the bag from which the ball is drawn. But can we find the probability that the ball drawn is from a particular bag (say Bag II), if the colour of the ball drawn is given? Here, we have to find the reverse probability that Bag II was selected when the event that occurred after the selection is known. The famous mathematician Thomas Bayes solved the problem of finding reverse probability by using conditional probability. The formula developed by him is known as Bayes' theorem, which was published posthumously in 1763.
Before stating and proving Bayes' theorem, let us first take up a definition and some preliminary results.

Partition of a sample space


A set of events E1, E2, ..., En is said to represent a partition of the sample space S if
(a) Ei ∩ Ej = φ, i ≠ j, i, j = 1, 2, 3, ..., n
(b) E1 ∪ Ε2 ∪ ... ∪ En= S and
(c) P(Ei) > 0 for all i = 1, 2, ..., n.

In other words, the events E1, E2, ..., En represent a partition of the sample space S if they are pairwise disjoint, exhaustive and have nonzero probabilities. As an example, we see that any nonempty event E and its complement E′ form a partition of the sample space S, since they satisfy E ∩ E′ = φ and E ∪ E′ = S. From a Venn diagram, one can easily observe that if E and F are any two events associated with a sample space S, then the set {E ∩ F′, E ∩ F, E′ ∩ F, E′ ∩ F′} is a partition of the sample space S. It may be mentioned that the partition of a sample space is not unique; there can be several partitions of the same sample space. We shall now prove a theorem known as the theorem of total probability.
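But first, as a quick check of the three partition properties on a concrete example (my own toy sample space), the following verifies them for {E ∩ F′, E ∩ F, E′ ∩ F, E′ ∩ F′}:

from itertools import combinations

S = set(range(10))                       # a small sample space
E = {0, 1, 2, 3, 4}
F = {3, 4, 5, 6}
Ec, Fc = S - E, S - F                    # complements E' and F'

parts = [E & Fc, E & F, Ec & F, Ec & Fc]

# (a) pairwise disjoint
assert all(a & b == set() for a, b in combinations(parts, 2))
# (b) exhaustive: the union of the parts is S
assert set().union(*parts) == S
# (c) nonzero probability (nonempty under a uniform measure)
assert all(len(p) > 0 for p in parts)
print("partition verified")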

Theorem of total probability
Let {E1, E2,...,En} be a partition of the sample space S, and suppose that each of the
events E1, E2,..., En has nonzero probability of occurrence. Let A be any event associated
with S, then

P(A) = P(E1) P(A|E1) + P(E2) P(A|E2) + ... + P(En) P(A|En).

Proof: E1, E2, ..., En is a partition of the sample space S. Therefore
S = E1 ∪ E2 ∪ ... ∪ En

and Ei ∩ Ej = φ, i ≠ j, i, j = 1, 2, ..., n

Now, we know that for any event A,
 A = A ∩ S
     = A ∩ (E1 ∪ E2 ∪ ... ∪ En)
     = (A ∩ E1) ∪ (A ∩ E2) ∪ ...∪ (A ∩ En)
Also A ∩ Ei and A ∩ Ej are respectively the subsets of Ei and Ej . We know that
Ei and Ej are disjoint, for i ≠ j , therefore, A ∩ Ei and A ∩ Ej are also disjoint for all i ≠ j, i, j = 1, 2, ..., n.

Thus, P(A) = P [(A ∩ E1) ∪ (A ∩ E2)∪ .....∪ (A ∩ En)]
                  = P (A ∩ E1) + P (A ∩ E2) + ... + P (A ∩ En)

Now, by multiplication rule of probability, we have
P(A ∩ Ei) = P(Ei) P(A|Ei), since P(Ei) ≠ 0 for all i = 1, 2, ..., n

Therefore, P (A) = P (E1) P (A|E1) + P (E2) P (A|E2) + ... + P (En)P(A|En)

Or, compactly,

P(A) = ∑ P(Ej) P(A|Ej), where the sum is over j = 1, 2, ..., n.
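Applying this to the two-bag example above, a small worked sketch of the theorem of total probability and the reverse (Bayes) probability it enables:

from fractions import Fraction as F

# Bag I has 2 white and 3 red balls; Bag II has 4 white and 5 red.
p_bag = {"I": F(1, 2), "II": F(1, 2)}    # P(Ej): each bag equally likely
p_white = {"I": F(2, 5), "II": F(4, 9)}  # P(A|Ej): white given the bag

# Theorem of total probability: P(white) = sum of P(Ej) P(white|Ej)
p_white_total = sum(p_bag[j] * p_white[j] for j in p_bag)
print(p_white_total)                     # 19/45

# Bayes' theorem: reverse probability P(Bag II | white)
posterior_II = p_bag["II"] * p_white["II"] / p_white_total
print(posterior_II)                      # 10/19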
Let's come back to the topic of the Knowledge Graph.
The Google Knowledge Graph helps answer users' queries. It can:
● know what things exist for the text you type into the search engine;
● summarize relevant facts about those things;
● discover related things of interest.


So you must have realized that Google is a pioneer in the field of information structuring and management, providing you access to information that is useful to you in the best possible way. It also learns your interests based on the queries you type into the Google search engine, and makes your search more fruitful by showing information based on your search history and your interests using optimized predictive algorithms.

I will come up with more advertisement algorithms in the future. If you have any query, ask, and subscribe to my page to get the latest updates.

To read more of my pages, click the link below.









Thursday, March 26, 2015

Artificial intelligence and LAO*: A heuristic search algorithm that finds solutions with loops, Heuristic search in AND/OR graphs, Search Engine Optimization

Classic heuristic search algorithms can find solutions that take the form of a simple path (A*), a tree, or an acyclic graph (AO*). Here we discuss a heuristic search algorithm, called LAO*, that can find solutions with loops.
One of the most widely-used frameworks for problem-solving in artificial intelligence is state-space search. A state-space search problem is defined by a set of states, a set of actions (or operators) that map states to successor states, a start state, and a set of goal states. The objective is to find a sequence of actions that transforms the start state into a goal state, and also optimizes some measure of the quality of the solution.
Two well-known heuristic search algorithms for state-space search problems are A* and AO*. A* finds a solution that takes the form of a sequence of actions leading in a path from the start state to a goal state. AO* finds a solution that has a conditional structure and takes the form of a tree, or more generally, an acyclic graph. But no heuristic search algorithm has been developed that can find a solution that takes the form of a cyclic graph, that is, a solution with loops. For many problems that can be formalized in the state-space search model, it does not make sense for a solution to contain loops. A loop in a solution to a theorem-proving problem represents circular reasoning. A loop in a solution to a problem-reduction task represents a failed reduction to primitive subproblems. However, there is an important class of problems for which it does make sense for a solution to contain loops. These problems can be formalized as Markov decision processes, a framework that is widely used in artificial intelligence for problems of planning and learning under uncertainty.

A Markov decision process (MDP) models problems of sequential decision making that include actions that transform a state into one of several possible successor states, with each possible state transition occurring with some probability. A solution to an MDP takes the form of a mapping from states to actions called a policy. A policy is executed by observing the current state and taking the action prescribed for it. A solution represented in this way implicitly contains both branches and loops. Branching is present because the state that stochastically results from an action determines the next action. Looping is present because the same state may be revisited under a policy. (As an example of a plan with a conditional loop, consider an action that has its desired effect with probability less than one and otherwise has no effect. An appropriate plan might be to repeat the action until it “succeeds”.)
So the question is: what is a Markov decision process, and how can it be represented mathematically?
Markov decision processes (MDPs), named after Andrey Markov, provide a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying a wide range of optimization problems solved via dynamic programming and reinforcement learning. MDPs were known at least as early as the 1950s (cf. Bellman 1957). A core body of research on Markov decision processes resulted from Ronald A. Howard's book published in 1960, Dynamic Programming and Markov Processes. They are used in a wide range of disciplines, including robotics, automated control, economics, and manufacturing. More precisely, a Markov decision process is a discrete-time stochastic control process. At each time step, the process is in some state s, and the decision maker may choose any action a that is available in state s. The process responds at the next time step by randomly moving into a new state s', giving the decision maker a corresponding reward R_a(s, s'). The probability that the process moves into its new state s' is influenced by the chosen action; specifically, it is given by the state transition function P_a(s, s'). Thus, the next state s' depends on the current state s and the decision maker's action a. But given s and a, it is conditionally independent of all previous states and actions; in other words, the state transitions of an MDP satisfy the Markov property.
Markov decision processes are an extension of Markov chains; the difference is the addition of actions (allowing choice) and rewards (giving motivation). Conversely, if only one action exists for each state and all rewards are the same (e.g., zero), a Markov decision process reduces to a Markov chain. An optimal policy can be found using a dynamic programming algorithm such as policy iteration or value iteration. But a disadvantage of dynamic programming is that it evaluates the entire state space. In effect, it finds a policy for every possible starting state. By contrast, heuristic search algorithms solve a problem for a particular starting state and use an admissible heuristic to focus the search, and remove from consideration regions of the state space that can't be reached from the start state by an optimal solution. For problems with large state spaces, heuristic search has an advantage over dynamic programming because it can find an optimal solution for a start state without evaluating the entire state space. This advantage is well-known for problems that can be solved by A* or AO*. In fact, an important theorem about the behavior of A* is that (under certain conditions) it evaluates the minimal number of states among all algorithms that find an optimal solution using the same heuristic. A related result has also been established for AO*. In this page, we generalize the heuristic search approach to find solutions with loops and show that the resulting algorithm, which we call LAO*, can solve planning problems that are formalized as MDPs.
Below we review dynamic programming algorithms for MDPs and heuristic search algorithms for state-space search problems, and discuss their relationship.
Markov decision processes:
We consider an MDP with a finite set of states, S. For each state i ∈ S, let A(i) denote a finite set of actions available in that state. Let pij(a) denote the probability that taking action a in state i results in a transition to state j. Let ci(a) denote the expected immediate cost of taking action a in state i. We focus on a special class of MDPs called stochastic shortest-path problems because it generalizes traditional shortest-path problems for which artificial intelligence (AI) search algorithms have been developed. (The name "shortest-path" reflects an interpretation of action costs as arc lengths.) A stochastic shortest-path problem is an MDP with a set of terminal states, T ⊆ S. For every terminal state i ∈ T, no action can cause a transition out of this state (i.e., it is an absorbing state) and the immediate cost of any action taken in this state is zero. Formally, pii(a) = 1 and ci(a) = 0 for all i ∈ T and a ∈ A(i). For all other states, immediate costs are assumed to be positive for every action, that is, ci(a) > 0 for all states i ∉ T and actions a ∈ A(i). The objective is to reach a terminal state while incurring minimum expected cost. Thus, we can think of terminal states as goal states, and we will use the terms "terminal state" and "goal state" interchangeably from now on. Because the probabilistic outcomes of actions can create a nonzero probability of revisiting the same state, the worst-case number of steps needed to reach a goal state cannot be bounded. Hence, stochastic shortest-path problems are said to have an indefinite horizon. A solution to an indefinite-horizon MDP can be represented as a stationary mapping from states to actions, π : S → A, called a policy. A policy is said to be proper if, for every state, it ensures that a goal state is reached with probability 1.0. For a proper policy, the expected cost for reaching a goal state from each state i ∈ S is finite and can be computed by solving the following system of |S| linear equations in |S| unknowns:

f^π(i) = 0 for i ∈ T, and f^π(i) = ci(π(i)) + ∑j pij(π(i)) f^π(j) otherwise.
We call f^π the evaluation function of policy π. We assume that the stochastic shortest-path problems we consider have at least one proper policy, and that every improper policy has infinite expected cost for at least one state. This assumption is made by Bertsekas in developing the theory of stochastic shortest-path problems. It generalizes the assumption for deterministic shortest-path problems that there is a path to the goal state from every other state, and all cycles have positive cost. A policy π is said to dominate a policy π′ if f^π(i) ≤ f^π′(i) for every state i. An optimal policy, π∗, dominates every other policy, and its evaluation function, f∗, satisfies the following system of |S| nonlinear equations in |S| unknowns, called the Bellman optimality equation:

f∗(i) = 0 for i ∈ T, and f∗(i) = min over a ∈ A(i) of [ci(a) + ∑j pij(a) f∗(j)] otherwise.
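As a small sketch of policy evaluation (my own toy problem, solving the linear system above with numpy for a fixed proper policy):

import numpy as np

# Toy stochastic shortest-path problem: states 0 and 1 are nongoal,
# state 2 is the absorbing goal. Under the fixed policy, each nongoal
# action costs 1 and the transition probabilities p_ij are:
P = np.array([[0.5, 0.3, 0.2],           # from state 0
              [0.0, 0.6, 0.4],           # from state 1
              [0.0, 0.0, 1.0]])          # goal state is absorbing
c = np.array([1.0, 1.0, 0.0])            # zero cost at the goal

# For nongoal states, f(i) = c_i + sum_j p_ij f(j) with f(goal) = 0,
# i.e., restricted to nongoal states: (I - P_ng) f_ng = c_ng.
ng = [0, 1]                              # nongoal state indices
A = np.eye(2) - P[np.ix_(ng, ng)]
f_ng = np.linalg.solve(A, c[ng])
print(f_ng)                              # expected costs-to-go: [3.5, 2.5]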
Dynamic programming algorithms for MDPs find the evaluation function that satisfies the Bellman equation by successively improving an estimated evaluation function, f, by performing backups. For each state i, a backup takes the following form:

f(i) := min over a ∈ A(i) of [ci(a) + ∑j pij(a) f(j)]

Performing a backup for every state i in the state set is referred to as a dynamic programming update. It is the core step of dynamic-programming algorithms for solving MDPs. The fact that a backup is performed for every state in the state set is characteristic of dynamic programming, which evaluates all problem states.
There are two related dynamic programming algorithms for indefinite-horizon MDPs: policy iteration and value iteration. Policy iteration interleaves the dynamic-programming update, used for policy improvement, with policy evaluation. When the current policy π is not optimal, the dynamic-programming update finds a new policy π′ such that f^π′(i) ≤ f^π(i) for every state i and f^π′(i) < f^π(i) for at least one state i. Because the number of possible policies is finite and policy iteration improves the current policy each iteration, it converges to an optimal policy after a finite number of iterations.
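A minimal value-iteration sketch (my own illustration; the state space, action sets and probabilities are supplied as plain Python functions), repeating the backup above until the estimates stop changing:

def value_iteration(S, T, A, p, c, eps=1e-9):
    # S: list of states; T: set of goal states; A(i): actions at state i;
    # p(i, a): dict mapping successor j -> probability; c(i, a): cost.
    # Repeats f(i) := min_a [c(i, a) + sum_j p_ij(a) f(j)] to convergence.
    f = {i: 0.0 for i in S}
    while True:
        delta = 0.0
        for i in S:
            if i in T:
                continue                 # goal states keep cost zero
            best = min(c(i, a) + sum(pr * f[j] for j, pr in p(i, a).items())
                       for a in A(i))
            delta = max(delta, abs(best - f[i]))
            f[i] = best
        if delta < eps:
            return f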
Heuristic search in AND/OR graphs
Like an MDP, a state-space search problem is defined by a set of states (including a start state and a set of goal states), a set of actions that cause state transitions, and a cost function that specifies costs for state transitions. The objective is to find a minimal-cost path from the start state to a goal state. It is customary to formalize a state-space search problem as a graph in which each node of the graph represents a problem state and each arc represents a state transition that results from taking an action. We use the same notation for state-space search problems as for MDPs. Let S denote the set of possible states, let s ∈ S denote a start state that corresponds to the root of the graph, and let T ⊂ S denote a set of goal (or terminal) states that occur at the leaves of the graph. Let A denote a finite set of actions and let A(i) denote the set of actions applicable in state i. The most general state-space search problem considered in the AI literature is AND/OR graph search. Following Martelli and Montanari and Nilsson, we define an AND/OR graph as a hypergraph. Instead of arcs that connect pairs of states as in an ordinary graph, a hypergraph has hyperarcs or k-connectors that connect a state to a set of k successor states.

[Figure: (a) shows an OR node with two arcs leaving it, one for action a1 and one for action a2. Each arc leads to an AND node with two successor OR nodes, one for each possible successor state. (By convention, a square denotes an OR node and a circle denotes an AND node. In the terminology of decision analysis, a square corresponds to a choice node and a circle corresponds to a chance node.) (b) shows a state, indicated by a circle, with two 2-connectors leaving it, one for action a1 and one for action a2. Each 2-connector leads to two possible successor states. The representation on the right, using state nodes and k-connectors, is equivalent to the representation on the left, which uses OR and AND nodes.]

The figure relates the concept of OR and AND nodes to the concept of a hyperarc or k-connector. A k-connector can be interpreted in different ways. In problem-reduction search, it is interpreted as the transformation of a problem into k subproblems. We consider problems of planning under uncertainty in which it is interpreted as an action with an uncertain outcome. The action transforms a state into one of k possible successor states, with a probability attached to each successor. We let pij(a) denote the probability that taking action a in state i results in a transition to state j. This interpretation of AND/OR graphs is made by Martelli and Montanari [14] and Pattipati and Alexandridis, among others. In AND/OR graph search, a solution takes the form of an acyclic subgraph called a solution graph, which is defined as follows:
• the start state belongs to a solution graph;
• for every nongoal state in a solution graph, exactly one outgoing k-connector (corresponding to an action) is part of the solution graph and each of its successor states also belongs to the solution graph;
• every directed path in the solution graph terminates at a goal state.
A cost function assigns a cost to each k-connector. Let ci(a) denote the cost for the k-connector that corresponds to taking action a in state i. We assume each goal state has a cost of zero. A minimal-cost solution graph is found by solving the following system of equations:

f∗(i) = 0 for i ∈ T, and f∗(i) = min over a ∈ A(i) of [ci(a) + ∑j pij(a) f∗(j)] otherwise,

where f∗(i) denotes the cost of an optimal solution for state i and f∗ is called the optimal evaluation function. Note that this is identical to the Bellman optimality equation defined for MDPs above, although we now assume the solution can be represented as an acyclic graph. This assumption means that a goal state can be reached from the start state after a bounded number of actions (where the bound is equal to the longest path from the start state to a goal state). By contrast, a stochastic shortest-path problem has an unbounded or indefinite horizon. Thus the traditional framework of AND/OR graph search does not directly apply to stochastic shortest-path problems.

Search Engine Optimization
Search engine optimization is often about making small modifications to parts of your website. When viewed individually, these changes might seem like incremental improvements, but when combined with other optimizations, they could have a noticeable impact on your site's user experience and performance in organic search results. You're likely already familiar with many of the topics in this guide, because they're essential ingredients for any web page, but you may not be making the most out of them.

Create unique, accurate page titles
A title tag tells both users and search engines what the topic of a particular page is. The <title> tag should be placed within the <head> tag of the HTML document (1). Ideally, you should create a
unique title for each page on your site.
(2) A user performs the query [baseball cards]. Our homepage shows up as a result, with the title listed on the first line (notice that the query terms the user searched for appear in bold).
(3) A user performs the query [rarest baseball cards]. A relevant, deeper page (its
title is unique to the content of the page) on our site appears as a result.
Page title contents are displayed in search results
If your document appears in a search results page, the contents of the title tag will usually appear in the first line of the results (if you're unfamiliar with the different parts of a Google search result,
you might want to check out the anatomy of a search result video by Google engineer Matt Cutts, and this helpful diagram of a Google search results page). Words in the title are bolded if they appear in the user's search query. This can help users recognize if the page is likely to be relevant to their search (2). The title for your homepage can list the name of your website/business and could include other bits of important information like the physical location of the business or maybe a few of its main focuses or offerings (3).
AO*
For state-space search problems that are formalized as AND/OR graphs, an optimal solution graph can be found using the heuristic search algorithm AO*. Nilsson [16,17] first described a version of AO* for searching AND/OR trees to find a solution in the form of a tree. Martelli and Montanari [13,14] generalized this algorithm for searching AND/OR graphs to find a solution in the form of an acyclic graph. Nilsson [18] also used the name AO* for this more general algorithm. Because any acyclic graph can be unfolded into an equivalent tree, the tree-search and graph-search versions of AO* solve the same class of problems. The graph-search version is more efficient when the same state can be reached along different paths because it avoids performing duplicate searches. Like other heuristic search algorithms, AO* can find an optimal solution without considering every problem state. Therefore, a graph is not usually supplied explicitly to the search algorithm. An implicit graph, G, is specified implicitly by a start state s and a successor function. The search algorithm constructs an explicit graph, G′, that initially consists only of the start state. A tip or leaf state of the explicit graph is said to be terminal if it is a goal state; otherwise, it is said to be nonterminal. A nonterminal tip state can be expanded by adding to the explicit graph its outgoing k-connectors (one for each action) and any successor states not already in the explicit graph.

AO* solves a state-space search problem by gradually building a solution graph, beginning from the start state. A partial solution graph is defined similarly to a solution graph, with the difference that the tip states of a partial solution graph may be nonterminal states of the implicit AND/OR graph. A partial solution graph is defined as follows:
• the start state belongs to a partial solution graph;
• for every nontip state in a partial solution graph, exactly one outgoing k-connector (corresponding to an action) is part of the partial solution graph and each of its successor states also belongs to the partial solution graph;
• every directed path in a partial solution graph terminates at a tip state of the explicit graph.
As with solution graphs, there are many possible partial solution graphs, and an evaluation function can be used to rank them. The cost of a partial solution graph is defined similarly to the cost of a solution graph. The difference is that if a tip state of a partial solution graph is nonterminal, it does not have a cost that can be propagated backwards. Instead, we assume there is an admissible heuristic estimate h(i) of the minimal-cost solution graph for state i. A heuristic evaluation function h is said to be admissible if h(i) ≤ f∗(i) for every state i. We can recursively calculate an admissible heuristic estimate f(i) of the optimal cost of any state i in the explicit graph as follows:

f(i) = 0 if i is a goal state; f(i) = h(i) if i is a nonterminal tip state; and otherwise f(i) = min over a ∈ A(i) of [ci(a) + ∑j pij(a) f(j)].
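A minimal sketch of this recursive cost calculation over an explicit acyclic graph (my own illustrative names; the recursion terminates because the explicit graph is assumed acyclic here):

from functools import lru_cache

def make_estimate(goals, tips, h, A, p, c):
    # goals: goal states; tips: nonterminal tip states (scored by h);
    # A(i): actions at i; p(i, a): successor -> probability; c(i, a): cost.
    @lru_cache(maxsize=None)
    def f(i):
        if i in goals:
            return 0.0                   # goal states cost nothing
        if i in tips:
            return h(i)                  # admissible heuristic at the frontier
        return min(c(i, a) + sum(pr * f(j) for j, pr in p(i, a).items())
                   for a in A(i))
    return f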
The best partial solution graph can be determined at any time by propagating heuristic estimates from the tip states of the explicit graph to the start state. If we mark the action that minimizes the value of each state, the best partial solution graph can be determined by starting at the root of the graph and selecting the best action for each reachable state.

The AO* algorithm:
1. The explicit graph G′ initially consists of the start state s.
2. While the best solution graph has some nonterminal tip state:
(a) Expand best partial solution: Expand some nonterminal tip state n of the best partial solution graph and add any new successor states to G′. For each new state i added to G′ by expanding n, if i is a goal state then f(i) := 0; else f(i) := h(i).
(b) Update state costs and mark best actions:
i. Create a set Z that contains the expanded state and all of its ancestors in the explicit graph along marked action arcs. (I.e., only include ancestor states from which the expanded state can be reached by following the current best solution.)
ii. Repeat the following steps until Z is empty.
A. Remove from Z a state i such that no descendent of i in G′ occurs in Z.
B. Set f(i) := min over a ∈ A(i) of [ci(a) + ∑j pij(a) f(j)] and mark the best action for i. (When determining the best action, resolve ties arbitrarily, but give preference to the currently marked action.)
3. Return an optimal solution graph.

This outline describes AO* for finding a minimal-cost solution graph in an acyclic AND/OR graph. It interleaves forward expansion of the best partial solution with a cost revision step that updates estimated state costs and the best partial solution. In the simplest version of AO*, the costs of the expanded state and all of its ancestor states in the explicit graph are updated. However, Martelli and Montanari and Nilsson note that the only ancestor states that may have their costs change are those from which the expanded state can be reached by taking marked actions (i.e., by choosing the best action for each state). Thus, the parenthetical remark in step 2(b)i above indicates that a parent i of state j is not added to Z unless both the estimated cost of state j has changed and state j can be reached from state i by choosing the best action for state i.

To simplify exposition, we have omitted from our outline of AO* a solve-labeling procedure that is often included in the algorithm to improve efficiency. Briefly, a state is labeled solved if it is a goal state or if all of its successor states are labeled solved. Labeling states as solved can improve the efficiency of the solution expansion step because it is unnecessary to search below a solved state for nonterminal tip states. We also note that the best partial solution graph may have many nonterminal tip states. AO* works correctly no matter which of these tip states is chosen for expansion. However, the efficiency of AO* can be improved by using a good selection function to choose which nonterminal tip state of the best partial solution graph to expand next. Possibilities include selecting the tip state with the least estimated cost, or selecting the tip state with greatest probability of being reached.

Although AND/OR graphs are usually assumed to be acyclic, the possibility of searching cyclic AND/OR graphs has been studied. In previous work, however, the solution graph is assumed to be acyclic.