        '/(\$[a-zA-Z_]+)/',
        'IRI' => '/^([a-zA-Z]+:\/\/[^\W"\']+)/',
        'SPARQL_OPTIONAL' => '/(\$[a-zA-Z_]+?)_optional/',
    );

    function eRDFT($erdf, $regexes=array(), $arc_config_path=null, $auto=true)
    {
        $this->erdf = $erdf;
        $this->arc_config_path = ($arc_config_path) ? $arc_config_path : dirname(__FILE__).'/arc/arc_config.php';

        /* Instantiate API */
        $api_args = array(
            "inc_path" => dirname(__FILE__)."/arc/",
            "config_path" => $this->arc_config_path,
        );
        $this->api = new ARC_api($api_args);
        $this->api->db_connect() or die("Couldn't connect to database");
        if(!$this->api->store_exists())
            $this->api->create_store() or die("Couldn't create rdfstore - check user permissions");

        // caller-supplied regexes override the defaults
        foreach($regexes as $k => $v)
        {
            $this->REG_EXES[$k] = $v;
        }

        if($auto)
        {
            $this->parse_erdf($erdf);
            $this->makeResultsets();
            $this->doQueries();
            $this->format_erdf();
        }
        //else the user can make changes to the triples before processing
    }

    // sorts the collected sparql vars into the top level result set and nested sub result sets
    function makeResultsets()
    {
        $sparql_vars = $this->sparql_vars;
        foreach($sparql_vars as $v)
        {
            if(strpos($v, '__') !== false) // __ is the path separator
            {
                // eg: '$cd__artist__name' gives the path keys array('cd', 'artist')
                $path_keys = array_slice(explode('__', substr($v,1)), 0, -1);

                // walk down to the nested result set by reference rather than building code for eval()
                $node =& $this->resultsets;
                foreach($path_keys as $key)
                {
                    $node =& $node['sub_results'][$key];
                }
                $node['sparql_vars'][] = $v;
                $node['row_prefix'] = implode('__', $path_keys).'__';
                unset($node); // drop the reference before the next variable
            }
            else
            {
                $this->top_level_resultset['sparql_vars'][] = $v;
            }
        }
    }

    // parses eRDF html and extracts triples
    function parse_erdf()
    {
        $parser_args = array("encoding"=>"auto");
        $parser = new ARC_erdf_parser($parser_args);
        $p_result = $parser->parse_data($this->erdf);

        //check if the eRDF parses
        if($p_result["result"] && !$p_result["error"])
        {
            //get triples
            $this->triple_info = $parser->get_triple_infos();

            //process triples for sparql syntax
            for($x=0; $x < count($this->triple_info['triples']); $x++)
            {
                $t = $this->triple_info['triples'][$x];
                //clean s and o into sparql vars. stuff{$name} => $name
                foreach(array('s','o') as $k)
                {
                    // pass $k so the node type is recorded under 's_erdft_type' / 'o_erdft_type'
                    $this->_format_node($t[$k], $t, $k);
                }
                $this->processed_triples[] = $t;
            }
            return true;
        }
        else
        {
            if($p_result["error"]) $this->errors[] = $p_result['error'];
            return false;
        }
    }

    // returns the triples with subject $s and removes them from $qt
    function get_triples_by_subject($s, &$qt)
    {
        $nqt = array();
        // foreach works on a copy, so unsetting entries of $qt while looping is safe
        foreach($qt as $i => $t)
        {
            if($t['s'] == $s)
            {
                $nqt[] = $t;
                unset($qt[$i]);
            }
        }
        return $nqt;
    }

    function doQueries()
    {
        //make prefix head
        $prefixes = '';
        foreach($this->triple_info['prefixes'] as $k => $v)
            $prefixes .= 'PREFIX '.$k.': '.'<'.$v.'>'." \r\n";

        $conditions = array();
        $optionals = array();
        foreach($this->processed_triples as $rt)
        {
            $condition = "{$rt['s']} {$rt['p_qname']} {$rt['o']} .";
            if(isset($rt['SPARQL_TRIPLE_TYPE']) && $rt['SPARQL_TRIPLE_TYPE'] == 'OPTIONAL')
                $optionals[] = "OPTIONAL { {$condition} }"; // SPARQL groups optional patterns with braces, not parentheses
            else
                $conditions[] = $condition;
        }

        $sparql = $prefixes;
        if(empty($conditions)) $this->errors[] = 'The query needs some non-optional clauses';
        $conditions = implode("\t\r\n", $conditions);
        $optionals = implode("\t\r\n", $optionals);
        $whereclause = $conditions."\r\n".$optionals;

        //top level sparql
        $sparql .= "\r\n SELECT DISTINCT ".implode(" ", array_unique($this->top_level_resultset['sparql_vars']))."\r\n WHERE { $whereclause } LIMIT 1";
        $this->top_level_resultset['sparql'] = $sparql;
        $data = $this->get_data($sparql);
        if(!empty($data[0]))
            foreach($data[0] as $k => $v)
                $this->results[$k] = $v;

        //get nested loops recursively
        $this->nested_data($this->resultsets['sub_results'], $this->results, $prefixes, $whereclause);
    }

    function nested_data(&$resultset, &$dataset, $prefixes, $whereclause)
    {
        //the $k of resultset is pluralised in dataset by appending an 's'. Better to write 'mouses' than have to think about whether 'mice' will work.
        foreach($resultset as $k => $v)
        {
            //build the sparql query string
            $sparql = $prefixes;
            $sparql_vars = array_unique($v['sparql_vars']);
            $sparql .= "\r\n SELECT ".implode(" ", $sparql_vars)."\r\n WHERE { $whereclause } LIMIT 3";

            //get rid of this particular row prefix
            //eg: cd__artist becomes artist, but artist__name stays artist__name in this query
            $sparql = str_replace($v['row_prefix'], '', $sparql);
            $resultset[$k]['sparql'] = $sparql;

            //get and save results
            $dataset[$k.'s'] = $this->get_data($sparql, 'rows');

            if(isset($resultset[$k]['sub_results']) && !empty($dataset[$k.'s']))
            {
                //now go through each row of the current result set, making a query for any sub result sets
                $row_number = 0;
                foreach($dataset[$k.'s'] as $row)
                {
                    foreach($row as $result_head => $result_value)
                    {
                        //now we replace the variable with the values we already retrieved for the sub query
                        $non_var = $this->_format_node($result_value, $nothing_happens_with_this_variable);
                        foreach($resultset[$k]['sub_results'] as $k2 => $v2)
                        {
                            $whereclause = str_replace('$'.$v2['row_prefix'].$result_head, $non_var, $whereclause);
                        }
                        $this->nested_data($resultset[$k]['sub_results'], $dataset[$k.'s'][$row_number], $prefixes, $whereclause);
                    }
                    $row_number++;
                }
            }
        }
    }

    // queries the rdf-store with the sparql query and gets back the data
    function get_data($sparql, $resultype='rows')
    {
        $this->sparqls[] = $sparql;
        $query_args = array(
            "result_type" => $resultype, // (rows|json|xml|single|rows_n_count|row_count|sql)
            "query" => $sparql
        );
        $qr = $this->api->query($query_args) or die("Couldn't query datastore");
        if(!empty($qr['error'])) $this->errors[] = $qr['error'];
        if(empty($qr['result'])) $this->errors[] = 'Query returned no data.';
        return $qr['result'];
    }

    function get_eRDF()
    {
        return $this->erdf;
    }

    function get_results()
    {
        return $this->results;
    }

    function get_sparqls()
    {
        return $this->sparqls;
    }

    // indexes the processed triples by the resources they mention
    function index_resources()
    {
        $tri = $this->processed_triples;
        if(empty($tri)) $this->errors[] = 'No Triples were processed!';
        $r = array();
        foreach($tri as $t)
        {
            $r[$t['s']]['props'][] = $t;
            $r[$t['o']]['inverse_props'][] = $t;
            $r[$t['s']]['all_props'][] = $t;
            $r[$t['o']]['all_props'][] = $t;
            //array_unique($r['subjects'][$t['s']]['props']);
        }
        $this->resource_indices = $r;
    }

    // rewrites $node into sparql syntax and records its type on the triple $t.
    // $k is the triple key ('s' or 'o') being formatted; it defaults to '' so
    // two-argument calls keep working.
    function _format_node(&$node, &$t, $k='')
    {
        if(preg_match($this->REG_EXES['SPARQL_VAR'], $node, $m))
        {
            if(preg_match($this->REG_EXES['SPARQL_OPTIONAL'], $m[1], $sub_m))
            {
                $node = str_replace('.', '__', $sub_m[1]);
                $this->sparql_vars[] = $node;
                $t['SPARQL_TRIPLE_TYPE'] = 'OPTIONAL';
            }
            else
            {
                $node = str_replace('.', '__', $m[1]);
                $this->sparql_vars[] = $m[1];
            }
            $t[$k.'_erdft_type'] = 'SPARQL_VAR';
        }
        elseif($node == '') //if a node is empty, give it a var
        {
            $node = '?ERDFT';
            $t[$k.'_erdft_type'] = 'SPARQL_VAR';
            $this->errors[] = 'The query is not accurate - you need to add an id to your top level element';
        }
        elseif(preg_match($this->REG_EXES['IRI'], $node)) //IRI syntax
        {
            $node = '<'.$node.'>';
            $t[$k.'_erdft_type'] = 'IRI';
        }
        else // quote literals
        {
            $node = '"'.$node.'"';
            $t[$k.'_erdft_type'] = 'LITERAL';
        }
        return $node;
    }

    // strips the _optional suffix from the variables in the stored eRDF
    function format_erdf()
    {
        // bind by reference (=&) so the replacement is written back to $this->erdf;
        // the previous "&=" was a bitwise-and assignment on an undefined variable
        $erdf =& $this->erdf;
        $erdf = preg_replace($this->REG_EXES['SPARQL_OPTIONAL'], '$1', $erdf);
    }
}
?>