#include <tokenconfiguration.h>
Public Member Functions | |
TokenConfiguration (ProjectionTree *pt) | |
Constructor. | |
TokenConfiguration (TokenConfiguration *parent) | |
Constructor. | |
virtual | ~TokenConfiguration () |
void | createRoleList (vector< Role * > &roles, vector< unsigned > &role_counts) |
Extracts the role list described by this configuration. | |
TokenConfiguration * | applyTag (TAG t) |
Compute the configuration that results when moving tokens along tag t. | |
TokenConfiguration * | applyText () |
Compute the configuration that results when reading PCDATA content. | |
void | addPassiveTokens (unsigned i, unsigned n) |
Adds tokens to the passive tokens count. | |
void | print (OutputStream &dos, bool is_text=false) |
Debug prints the token configuration. | |
vector< unsigned > | getActiveTokens () |
Returns the active token vector. | |
void | setActiveTokens (unsigned i, unsigned n) |
Sets the active token count. | |
vector< unsigned > | getPassiveTokens () |
Returns the passive token vector. | |
void | setPassiveTokens (unsigned i, unsigned n) |
Sets the passive token count. | |
unsigned | getActiveTokensById (unsigned i) |
Returns the number of active tokens at a given position. | |
unsigned | getPassiveTokensById (unsigned i) |
Returns the number of passive tokens at a given position. | |
unsigned | getLastActiveTokenCountFor (unsigned token_id) |
Looks up the active token count in the last config where the token appears. | |
ProjectionTreeLabels * | getLabels () |
Returns a pointer to the ProjectionTreeLabels object. | |
bool | isEmpty () |
Checks for token configuration being empty. | |
bool | isOutput () |
Returns true if the configuration describes a node that will be written out. | |
bool | hasActiveToken () |
Checks for active tokens in the current configuration. | |
unsigned | sumUpActiveTokenCountFor (unsigned token_id) |
Sums up the active token count in this and ancestor configurations. | |
bool | forceChildKeep () |
Returns true if we are forced to keep all subsequent child tags. | |
bool | keepSubtree () |
Returns true if the configuration forces us to keep the whole subtree beneath matching document nodes. | |
bool | skipSubtree () |
Returns true if the configuration allows us to discard the whole subtree. | |
Private Attributes | |
TokenConfiguration * | parent |
The parent configuration. | |
ProjectionTreeLabels * | labels |
The label dictionary. | |
vector< unsigned > | active_tokens |
The active tokens are one component of a TokenConfiguration. | |
vector< unsigned > | passive_tokens |
The passive tokens are the second component of a TokenConfiguration. |
A token configuration thoroughly describes a Projection DFA state. Each token configuration is characterized by a multiset of tokens that are placed at path steps in the ProjectionTree. To this end, the projection tree is labeled (each path step in the tree is assigned a unique label), and tokens are placed on top of these labels. We always distinguish between active and passive tokens, i.e. tokens describing labels that are currently active, and tokens that describe previously matched labels (which are needed to keep track of descendant axes).
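The two-vector representation described here can be illustrated with a minimal sketch. SketchConfig is a hypothetical, simplified stand-in and not the actual class: it omits the label dictionary and the parent pointer, and it assumes that labels are numbered from 0 and that the root label has id 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for TokenConfiguration: it keeps only the
// two token vectors and ignores the ProjectionTreeLabels dictionary.
struct SketchConfig {
    std::vector<unsigned> active;   // active[i]  = active tokens on label i
    std::vector<unsigned> passive;  // passive[i] = passive tokens on label i

    // Assumption: labels are numbered 0..nr_labels-1 and the root has id 0.
    explicit SketchConfig(std::size_t nr_labels)
        : active(nr_labels, 0), passive(nr_labels, 0) {
        assert(nr_labels > 0);
        active[0] = 1;  // start with a single active token at the root label
    }
};
```

Both vectors have one slot per label, so a configuration is a pair of multisets over the same label ids.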
Definition at line 57 of file tokenconfiguration.h.
TokenConfiguration::TokenConfiguration | ( | ProjectionTree * | pt | ) |
Constructor.
Constructor - creates an object with one token placed at the root node.
[in] | pt | Pointer to a ProjectionTree object. |
Definition at line 37 of file tokenconfiguration.cpp.
References active_tokens, ProjectionTreeLabels::getAllRecursiveDosNodeSuccessors(), labels, ProjectionTreeLabels::nrOfLabels(), and passive_tokens.
Referenced by applyTag(), and applyText().
TokenConfiguration::TokenConfiguration | ( | TokenConfiguration * | parent | ) |
Constructor.
Constructor - creating object using the parent's passive tokens as a starting point.
[in] | parent | Pointer to a TokenConfiguration object. |
Definition at line 57 of file tokenconfiguration.cpp.
References active_tokens, labels, and ProjectionTreeLabels::nrOfLabels().
TokenConfiguration::~TokenConfiguration | ( | ) | [virtual] |
Destructor.
Definition at line 65 of file tokenconfiguration.cpp.
void TokenConfiguration::addPassiveTokens | ( | unsigned | i, | |
unsigned | n | |||
) | [inline] |
Adds tokens to the passive tokens count.
Adds n tokens to the passive tokens count at index i.
[in] | i | The token index. |
[in] | n | Number of tokens to be added. |
void |
Definition at line 112 of file tokenconfiguration.h.
References passive_tokens.
Referenced by applyTag().
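The difference between adding to a token count and setting it (compare setActiveTokens() and setPassiveTokens()) can be sketched on a plain vector. The free functions addTokens and setTokens are hypothetical helpers for illustration only; the bounds checks are an assumption and may not be present in the real inline methods.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: increases the count at index i by n.
void addTokens(std::vector<unsigned>& tokens, std::size_t i, unsigned n) {
    assert(i < tokens.size());  // assumed precondition: i is a valid label id
    tokens[i] += n;
}

// Hypothetical helper: overwrites the count at index i with n.
void setTokens(std::vector<unsigned>& tokens, std::size_t i, unsigned n) {
    assert(i < tokens.size());  // assumed precondition: i is a valid label id
    tokens[i] = n;
}
```

Starting from a count of 1 at index i, adding 2 yields 3, whereas setting 2 yields 2.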
TokenConfiguration * TokenConfiguration::applyTag | ( | TAG | t | ) |
Compute the configuration that results when moving tokens along tag t.
Creates a fresh TokenConfiguration object.
[in] | t | The tag that is applied to the current TokenConfiguration. |
TokenConfiguration* | The resulting token configuration. |
Definition at line 112 of file tokenconfiguration.cpp.
References active_tokens, addPassiveTokens(), ProjectionTreeLabel::descendantAxisBetw(), PathStepExpression::getAxisType(), ProjectionTreeLabel::getChildSuccessors(), ProjectionTreeLabel::getFSALabel(), ProjectionTreeLabel::getId(), ProjectionTreeLabels::getLabelById(), getLastActiveTokenCountFor(), ProjectionTreeLabel::getLeftmostSLPredecessor(), ProjectionTreeLabel::getPath(), ProjectionTreeLabel::getPathStep(), ProjectionTreeLabel::getPredecessor(), ProjectionTreeLabel::getSameLevelSuccessor(), ProjectionTreeLabel::isChildLabel(), ProjectionTreeLabel::isDescendantLabel(), ProjectionTreeLabel::isDosNodeLabel(), ProjectionTreeLabel::isDosOrDescendantLabel(), isEmpty(), labels, ProjectionTreeLabel::matchesTag(), passive_tokens, setActiveTokens(), sumUpActiveTokenCountFor(), and TokenConfiguration().
Referenced by ProjectionDFATransitions::computeTransition().
TokenConfiguration * TokenConfiguration::applyText | ( | ) |
Compute the configuration that results when reading PCDATA content.
Creates a fresh TokenConfiguration object.
TokenConfiguration* | The resulting token configuration. |
Definition at line 296 of file tokenconfiguration.cpp.
References active_tokens, ProjectionTreeLabel::descendantAxisBetw(), PathStepExpression::getAxisType(), ProjectionTreeLabel::getChildSuccessors(), ProjectionTreeLabel::getFSALabel(), ProjectionTreeLabel::getId(), ProjectionTreeLabels::getLabelById(), getLastActiveTokenCountFor(), ProjectionTreeLabel::getLeftmostSLPredecessor(), ProjectionTreeLabel::getPath(), ProjectionTreeLabel::getPathStep(), ProjectionTreeLabel::getPredecessor(), ProjectionTreeLabel::getSameLevelSuccessor(), hasActiveToken(), ProjectionTreeLabel::isChildLabel(), ProjectionTreeLabel::isDescendantLabel(), ProjectionTreeLabel::isDosNodeLabel(), labels, ProjectionTreeLabel::matchesText(), passive_tokens, setActiveTokens(), sumUpActiveTokenCountFor(), and TokenConfiguration().
Referenced by ProjectionDFATransitions::computeTextTransition().
void TokenConfiguration::createRoleList | ( | vector< Role * > & | roles, | |
vector< unsigned > & | role_counts | |||
) |
Extracts the role list described by this configuration.
The role list is computed from the roles that are associated with the projection tree labels identified by this configuration.
[in] | roles | The vector to store the roles in (pass-by-reference) |
[in] | role_counts | The vector to store the associated counts in (pass-by-reference) |
void |
Definition at line 68 of file tokenconfiguration.cpp.
References active_tokens, ProjectionTreeLabel::atEndOfPath(), ProjectionTreeLabel::getId(), ProjectionTreeLabels::getLabelById(), ProjectionTreeLabel::getPredecessor(), ProjectionTreeLabel::getProjectionTreeNode(), ProjectionTreeNode::getRole(), ProjectionTreeLabel::getSelfSuccessors(), ProjectionTreeLabel::isDosNodeLabel(), and labels.
Referenced by ProjectionDFAState::update().
bool TokenConfiguration::forceChildKeep | ( | ) |
Returns true if we are forced to keep all subsequent child tags.
The method handles the descendant-child axis clash and covers a special case where we are forced to keep child nodes although these child nodes are not immediately matched by the projection tree.
bool | True if a child-descendant conflict exists and child nodes must not be discarded. |
Definition at line 528 of file tokenconfiguration.cpp.
References active_tokens, ProjectionTreeLabel::getChildSuccessors(), ProjectionTreeLabels::getLabelById(), ProjectionTreeLabel::getSameLevelSuccessor(), ProjectionTreeLabel::getTag(), ProjectionTreeLabel::isChildLabel(), ProjectionTreeLabel::isDescendantLabel(), ProjectionTreeLabel::isNodeLabel(), ProjectionTreeLabel::isStarLabel(), labels, ProjectionTreeLabel::matchesText(), and passive_tokens.
Referenced by ProjectionDFAState::update().
vector< unsigned > TokenConfiguration::getActiveTokens | ( | ) | [inline] |
Returns the active token vector.
Returns a copy of the active_token vector member variable.
vector<unsigned> | A copy of the active tokens vector. |
Definition at line 128 of file tokenconfiguration.h.
References active_tokens.
unsigned TokenConfiguration::getActiveTokensById | ( | unsigned | i | ) | [inline] |
Returns the number of active tokens at a given position.
Returns the number of active tokens in the current configuration at a given position.
[in] | i | The index position. |
unsigned | The number of active tokens placed at position i. |
Definition at line 170 of file tokenconfiguration.h.
References active_tokens.
ProjectionTreeLabels * TokenConfiguration::getLabels | ( | ) | [inline] |
Returns a pointer to the ProjectionTreeLabels object.
Returns the associated ProjectionTreeLabels object, which contains a list of all labels.
ProjectionTreeLabels* | Pointer to the ProjectionTreeLabels object. |
Definition at line 204 of file tokenconfiguration.h.
References labels.
unsigned TokenConfiguration::getLastActiveTokenCountFor | ( | unsigned | token_id | ) |
Looks up the active token count in the last config where the token appears.
Recursively investigates the parent token configuration and returns the number of tokens with the given token_id in the last configuration that contains more than zero active tokens with the given id.
[in] | token_id | The id of the token for which the count lookup is performed. |
unsigned | The respective number of active tokens. |
Throws | RuntimeException if the TokenConfiguration is invalid, i.e. the requested token has never been used as an active token in any ancestor configuration. |
Definition at line 477 of file tokenconfiguration.cpp.
References active_tokens, getLastActiveTokenCountFor(), and parent.
Referenced by applyTag(), applyText(), and getLastActiveTokenCountFor().
vector< unsigned > TokenConfiguration::getPassiveTokens | ( | ) | [inline] |
Returns the passive token vector.
Returns a copy of the passive_token vector member variable.
vector<unsigned> | A copy of the passive tokens vector. |
Definition at line 148 of file tokenconfiguration.h.
References passive_tokens.
unsigned TokenConfiguration::getPassiveTokensById | ( | unsigned | i | ) | [inline] |
Returns the number of passive tokens at a given position.
Returns the number of passive tokens at a given position by token ID.
[in] | i | The token ID. |
unsigned | The number of passive tokens placed at position i. |
Definition at line 181 of file tokenconfiguration.h.
References passive_tokens.
bool TokenConfiguration::hasActiveToken | ( | ) |
Checks for active tokens in the current configuration.
Returns true if the configuration contains at least one active token, false otherwise.
bool | True if the configuration contains at least one active token, false otherwise. |
Definition at line 511 of file tokenconfiguration.cpp.
References active_tokens.
Referenced by applyText(), and ProjectionDFAState::update().
bool TokenConfiguration::isEmpty | ( | ) |
Checks for token configuration being empty.
A token configuration is defined to be empty if there are no active and no passive tokens.
bool | True if the current TokenConfiguration is empty, false otherwise. |
Definition at line 491 of file tokenconfiguration.cpp.
References active_tokens, and passive_tokens.
Referenced by applyTag(), and skipSubtree().
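The emptiness definition above, together with the check performed by hasActiveToken(), can be sketched on plain vectors. anyToken and isEmptySketch are hypothetical helpers; the sketch assumes that both vectors have one entry per label and therefore the same length.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: true if at least one position holds a token.
bool anyToken(const std::vector<unsigned>& tokens) {
    return std::any_of(tokens.begin(), tokens.end(),
                       [](unsigned n) { return n > 0; });
}

// Hypothetical helper: a configuration is empty if it has neither active
// nor passive tokens.
bool isEmptySketch(const std::vector<unsigned>& active,
                   const std::vector<unsigned>& passive) {
    assert(active.size() == passive.size());  // assumed: one slot per label
    return !anyToken(active) && !anyToken(passive);
}
```

Since skipSubtree() is documented as applying exactly when the configuration is empty, the same predicate also illustrates that check.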
bool TokenConfiguration::isOutput | ( | ) |
Returns true if the configuration describes a node that will be written out.
Returns true if the configuration describes a node that will be written out, i.e. if all descendants of the node must be kept.
bool | True if the TokenConfiguration is an output configuration, false otherwise. |
Definition at line 499 of file tokenconfiguration.cpp.
References active_tokens, ProjectionTreeLabels::getLabelById(), ProjectionTreeLabel::isDosNodeLabel(), and labels.
Referenced by ProjectionDFAState::update().
bool TokenConfiguration::keepSubtree | ( | ) |
Returns true if the configuration forces us to keep the whole subtree beneath matching document nodes.
This case applies if we are in one or more dos::node() states and in no other states (so that no role information needs to be computed for subnodes).
bool | True if the subtree is kept without any additional role information. |
Definition at line 615 of file tokenconfiguration.cpp.
References active_tokens, ProjectionTreeLabels::getLabelById(), ProjectionTreeLabel::isDosNodeLabel(), labels, and passive_tokens.
Referenced by ProjectionDFAState::update().
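The condition "one or more dos::node() states and no other states" can be sketched for a single token vector. onlyDosNodeTokens is a hypothetical helper, and the is_dos_node flags stand in for what labels->getLabelById(i)->isDosNodeLabel() would report. How the real method combines active and passive tokens is not documented here, so the sketch does not model it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true if at least one token sits on a dos::node()
// label and no token sits on any other kind of label.
bool onlyDosNodeTokens(const std::vector<unsigned>& tokens,
                       const std::vector<bool>& is_dos_node) {
    assert(tokens.size() == is_dos_node.size());  // one flag per label
    bool seen_dos = false;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        if (tokens[i] == 0) continue;       // unoccupied label, irrelevant
        if (!is_dos_node[i]) return false;  // token on a non-dos label
        seen_dos = true;
    }
    return seen_dos;
}
```

An all-zero vector yields false in this sketch, because the documented condition requires at least one dos::node() state.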
void TokenConfiguration::print | ( | OutputStream & | dos, | |
bool | is_text = false | |||
) |
Debug prints the token configuration.
This method is for debugging purposes only.
[in] | dos | Reference to the (debug) OutputStream. |
[in] | is_text | True if the configuration is a text configuration. |
void |
Definition at line 458 of file tokenconfiguration.cpp.
References active_tokens, and passive_tokens.
Referenced by ProjectionDFAState::print().
void TokenConfiguration::setActiveTokens | ( | unsigned | i, | |
unsigned | n | |||
) | [inline] |
Sets the active token count.
Sets the active token count at index i to value n.
[in] | i | The token index. |
[in] | n | Specified number. |
void |
Definition at line 139 of file tokenconfiguration.h.
References active_tokens.
Referenced by applyTag(), and applyText().
void TokenConfiguration::setPassiveTokens | ( | unsigned | i, | |
unsigned | n | |||
) | [inline] |
Sets the passive token count.
Sets the passive token count at index i to value n.
[in] | i | The token index. |
[in] | n | Specified number. |
void |
Definition at line 159 of file tokenconfiguration.h.
References passive_tokens.
bool TokenConfiguration::skipSubtree | ( | ) |
Returns true if the configuration allows us to discard the whole subtree.
This case applies if the current TokenConfiguration is empty.
bool | True if the subtree can be discarded, false otherwise. |
Definition at line 635 of file tokenconfiguration.cpp.
References isEmpty().
Referenced by ProjectionDFAState::update().
unsigned TokenConfiguration::sumUpActiveTokenCountFor | ( | unsigned | token_id | ) |
Sums up the active token count in this and ancestor configurations.
Recursively sums up the active tokens of the current TokenConfiguration and all ancestor configurations for a given token.
[in] | token_id | The id of the token the sum is to be computed for. |
unsigned | The aggregated number of active tokens. |
Definition at line 519 of file tokenconfiguration.cpp.
References active_tokens, parent, and sumUpActiveTokenCountFor().
Referenced by applyTag(), applyText(), and sumUpActiveTokenCountFor().
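The recursive summation over the current and all ancestor configurations can be sketched as follows. SumConfig and sumActive are hypothetical names, and the bounds check is an assumption that need not be part of the real method.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: a configuration with a parent link and active counts.
struct SumConfig {
    const SumConfig* parent;       // nullptr for the root configuration
    std::vector<unsigned> active;  // active[i] = active tokens on label i
};

// Adds up the active count for token_id in cfg and in every ancestor.
unsigned sumActive(const SumConfig* cfg, std::size_t token_id) {
    if (cfg == nullptr) {
        return 0;  // walked past the root, nothing left to add
    }
    assert(token_id < cfg->active.size());
    return cfg->active[token_id] + sumActive(cfg->parent, token_id);
}
```

For a chain root, child, leaf with counts 3, 1 and 2 for some token, the call on the leaf returns 6.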
vector< unsigned > TokenConfiguration::active_tokens [private] |
The active tokens are one component of a TokenConfiguration.
The set of active tokens describes the active states in the projection tree (i.e. the ProjectionTreeLabels) currently visited. The size of this vector equals the number of labels in the dictionary; for instance, tokens=(0,2,2,0) means: 0 tokens placed at label 0, 2 tokens placed at label 1, 2 tokens placed at label 2, and 0 tokens placed at label 3.
Definition at line 292 of file tokenconfiguration.h.
Referenced by applyTag(), applyText(), createRoleList(), forceChildKeep(), getActiveTokens(), getActiveTokensById(), getLastActiveTokenCountFor(), hasActiveToken(), isEmpty(), isOutput(), keepSubtree(), print(), setActiveTokens(), sumUpActiveTokenCountFor(), and TokenConfiguration().
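Reading such a vector can be sketched with two small helpers. tokensAt and occupiedLabels are hypothetical names; tokensAt mirrors what getActiveTokensById() is documented to return, with a bounds check that is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of tokens placed at label i.
unsigned tokensAt(const std::vector<unsigned>& tokens, std::size_t i) {
    assert(i < tokens.size());  // assumed precondition: i is a valid label id
    return tokens[i];
}

// Hypothetical helper: ids of all labels that hold at least one token.
std::vector<std::size_t> occupiedLabels(const std::vector<unsigned>& tokens) {
    std::vector<std::size_t> ids;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        if (tokens[i] > 0) {
            ids.push_back(i);
        }
    }
    return ids;
}
```

For tokens = {0, 2, 2, 0}, occupiedLabels returns the label ids 1 and 2, and tokensAt reports 2 for each of them.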
ProjectionTreeLabels * TokenConfiguration::labels [private] |
The label dictionary.
The label for token at index i can be simply retrieved by labels->getLabelById(i).
Definition at line 282 of file tokenconfiguration.h.
Referenced by applyTag(), applyText(), createRoleList(), forceChildKeep(), getLabels(), isOutput(), keepSubtree(), and TokenConfiguration().
TokenConfiguration * TokenConfiguration::parent [private] |
The parent configuration.
Is NULL for the root configuration.
Definition at line 276 of file tokenconfiguration.h.
Referenced by getLastActiveTokenCountFor(), and sumUpActiveTokenCountFor().
vector< unsigned > TokenConfiguration::passive_tokens [private] |
The passive tokens are the second component of a TokenConfiguration.
Passive tokens remember previously visited states, which are important when descendant axes are involved: in this case, we are interested in descendants starting from a given config in every following depth.
Definition at line 302 of file tokenconfiguration.h.
Referenced by addPassiveTokens(), applyTag(), applyText(), forceChildKeep(), getPassiveTokens(), getPassiveTokensById(), isEmpty(), keepSubtree(), print(), setPassiveTokens(), and TokenConfiguration().