Grobid

From BITPlan ceur-ws Wiki

Script

#!/bin/bash
# WF 2020-08-04
# call grobid service with paper from ceur-ws
v=2644
p=44
vol=Vol-$v
pdf=paper$p.pdf
# download the paper only if it is not available locally
if [ ! -f "$pdf" ]
then
  wget "http://ceur-ws.org/$vol/$pdf"
else
  echo "paper $p from volume $v already downloaded"
fi
# send the PDF to the GROBID REST service and store the TEI result
curl -v --form "input=@./$pdf" http://grobid.bitplan.com/api/processFulltextDocument > "$p.tei"

Docker service

docker pull lfoppiano/grobid:0.6.1
docker run -t --rm -p 8070:8070 --init lfoppiano/grobid:0.6.1

Debug

docker ps | grep -B2 grobid
CONTAINER ID        IMAGE                    COMMAND                  CREATED             STATUS                   PORTS                           NAMES
86a41ffef83a        lfoppiano/grobid:0.6.1   "/tini -s -- ./grobi…"   3 minutes ago       Up 2 minutes                                             heuristic_turing
docker exec -it heuristic_turing /bin/bash

Python client

see https://github.com/kermitt2/grobid_client_python

git clone https://github.com/kermitt2/grobid_client_python

Test

date;python grobid_client.py --input /hd/luxio/CEUR-WS/www processFulltextDocument;date
Thu 07 Jan 2021 08:22:16 PM CET
GROBID server is up and running
...
www2004-weesa.pdf
Traceback (most recent call last):
  File "grobid_client.py", line 208, in <module>
    force=force)
  File "grobid_client.py", line 66, in process
    print(filename)
UnicodeEncodeError: 'utf-8' codec can't encode character '\udc96' in position 21: surrogates not allowed
Thu 07 Jan 2021 11:40:18 PM CET
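
The traceback above is what Python raises when a file name contains bytes that could not be decoded and were kept as lone surrogates such as '\udc96'. The following is a minimal sketch of one way to print such names safely. It is not necessarily the change made in the client; the helper safe_name and the file name used here are hypothetical.

```python
def safe_name(filename: str) -> str:
    """Return a copy of filename that can always be encoded as UTF-8.

    Characters that cannot be encoded (for example lone surrogates
    left over from undecodable file name bytes) are replaced by '?'.
    """
    return filename.encode("utf-8", errors="replace").decode("utf-8")


# hypothetical file name containing a lone surrogate
bad = "paper\udc96.pdf"

try:
    bad.encode("utf-8")  # strict encoding, as used implicitly by print()
except UnicodeEncodeError as err:
    print("strict encoding fails:", err.reason)

# the sanitized name can be printed without an exception
print(safe_name(bad))
```

With such a helper the long-running batch would log an approximate name instead of aborting on the first problematic file.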

find . -name "*.tei.xml" | grep Vol| wc -l
17714
find . -name "*.pdf" | grep Vol| wc -l
53133
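
The two counts only show how many files are missing, not which ones. A small sketch that lists the PDFs without a result file could look as follows. It assumes that the result for paper.pdf is written next to it as paper.tei.xml; the function missing_tei and the demo directory are made up for illustration.

```python
import tempfile
from pathlib import Path


def missing_tei(root):
    """Return all PDF files below root that have no sibling .tei.xml file."""
    missing = []
    for pdf in sorted(Path(root).rglob("*.pdf")):
        tei = pdf.with_name(pdf.stem + ".tei.xml")
        if not tei.exists():
            missing.append(pdf)
    return missing


# small demo tree instead of the real CEUR-WS directory
with tempfile.TemporaryDirectory() as tmp:
    vol = Path(tmp) / "Vol-2644"
    vol.mkdir()
    (vol / "paper44.pdf").touch()
    (vol / "paper44.tei.xml").touch()
    (vol / "paper45.pdf").touch()  # not yet processed
    demo = [p.name for p in missing_tei(tmp)]

print(demo)
```

On the real data the function would be called with the directory that was passed to the client via --input.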

Restart

runtime: 23631.56 seconds 
Fri 08 Jan 2021 02:15:14 PM CET

After fixing https://github.com/kermitt2/grobid_client_python/issues/18 and restart:

52347/53133 = 98.5 % of all CEUR-WS PDF files were processed by GROBID to .tei.xml files

The first import attempt ran from 2021-01-07 19:18 to 22:40 (3 h 22 min) and processed 17,714 PDF files.

The restart ran from 2021-01-08 06:38 to 13:15 (6 h 37 min) and brought the total to 52,347 processed files. It needed a while to "catch up": the client first revisited the 17,714 files of the first attempt and found that they had already been processed (estimated at 15 min from manual observation).

Total: 9 h 59 min. The processing was therefore faster than 5000 PDFs / h.
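
The figures can be re-checked with a few lines of arithmetic. The sketch below only uses the numbers stated on this page; the variable names are chosen freely.

```python
total_pdfs = 53133   # PDF files found below the Vol directories
processed = 52347    # .tei.xml files after the restart

first_run_min = 3 * 60 + 22   # first attempt: 3 h 22 min
restart_min = 6 * 60 + 37     # restart: 6 h 37 min
total_min = first_run_min + restart_min

coverage = processed / total_pdfs * 100
rate_per_hour = processed / (total_min / 60)

print(f"total time: {total_min // 60} h {total_min % 60} min")
print(f"coverage:   {coverage:.1f} %")
print(f"throughput: {rate_per_hour:.0f} PDFs / h")
```

The throughput is an average over both runs and includes the catch-up phase of the restart.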

Test with xq

https://pypi.org/project/yq/
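
As an alternative to xq, the header of a TEI result can be inspected with the Python standard library. The following sketch assumes the element layout of the example on this page (titleStmt/title and analytic/author/persName in the TEI namespace); the function title_and_authors and the shortened SAMPLE document are illustrative only.

```python
import xml.etree.ElementTree as ET

# TEI namespace as declared in the GROBID output
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# shortened header, reduced from the Vol 2644 paper 44 example
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title level="a" type="main">One-Shot Rule Learning for Challenging Character Recognition</title>
      </titleStmt>
      <sourceDesc><biblStruct><analytic>
        <author><persName><forename type="first">Dany</forename><surname>Varghese</surname></persName></author>
        <author><persName><forename type="first">Alireza</forename><surname>Tamaddoni-Nezhad</surname></persName></author>
      </analytic></biblStruct></sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""


def title_and_authors(tei_xml):
    """Return the main title and the list of author names of a TEI document."""
    root = ET.fromstring(tei_xml)
    title = root.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS)
    authors = []
    for pers in root.iterfind(".//tei:analytic/tei:author/tei:persName", TEI_NS):
        forename = pers.findtext("tei:forename", default="", namespaces=TEI_NS)
        surname = pers.findtext("tei:surname", default="", namespaces=TEI_NS)
        authors.append(f"{forename} {surname}".strip())
    return title, authors


title, authors = title_and_authors(SAMPLE)
print(title)
print(authors)
```

For a file on disk the content can be read first and passed to the same function.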

Examples

Vol 2644 paper 44

<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 /usr/local/src/grobid/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">One-Shot Rule Learning for Challenging Character Recognition</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">Dany</forename><surname>Varghese</surname></persName>
							<email>dany.varghese@surrey.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">University of Surrey</orgName>
								<address>
									<settlement>Guildford</settlement>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">Alireza</forename><surname>Tamaddoni-Nezhad</surname></persName>
							<email>a.tamaddoni-nezhad@surrey.ac.uk</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">University of Surrey</orgName>
								<address>
									<settlement>Guildford</settlement>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">One-Shot Rule Learning for Challenging Character Recognition</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
				</biblStruct>
			</sourceDesc>
		</fileDesc>

		<encodingDesc>
			<appInfo>
				<application version="0.6.1-SNAPSHOT" ident="GROBID-SDO" when="2020-08-04T06:58+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid-sdo"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>One-Shot Learning· Rule-Based Machine Learning· Induc- tive Logic Programming (ILP) · Malayalam Character Recognition · Computer Vision</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Unlike most of computer vision approaches which depend on hundreds or thousands of training images, humans can typically learn from a single visual example. Humans achieve this ability using background knowledge. Rule-based machine learning approaches such as Inductive Logic Programming (ILP) provide a framework for incorporating domain specific background knowledge. These approaches have the potential for human-like learning from small data or even one-shot learning, i.e. learning from a single positive example. By contrast, statistics based computer vision algorithms, including Deep Learning, have no general mechanisms for incorporating background knowledge. In this paper, we present an approach for one-shot rule learning called One-Shot Hypothesis Derivation (OSHD) which is based on using a logic program declarative bias. We apply this approach to the challenging task of Malayalam character recognition. This is a challenging task due to spherical and complex structure of Malayalam hand-written language. Unlike for other languages, there is currently no efficient algorithm for Malayalam handwritten recognition. We compare our results with a state-of-the-art Deep Learning approach, called Siamese Network, which has been developed for one-shot learning. The results suggest that our approach can generate human-understandable rules and also outperforms the deep learning approach with a significantly higher average predictive accuracy.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>computer vision learning algorithms. For example, it is easy to produce images that are completely unrecognizable to humans, though DNN visual learning algorithms believe them to be recognizable objects with over 99% confidence <ref type="bibr" target="#b23">[24]</ref>.</p><p>Another major difference is related to the number of required training examples. Humans can typically learn from a single visual example <ref type="bibr" target="#b9">[10]</ref>, unlike statistical learning which depends on hundreds or thousands of images. Humans achieve this ability using background knowledge, which plays a critical role. By contrast, statistics based computer vision algorithms have no general mechanisms for incorporating background knowledge.</p><p>Computer vision is a multidisciplinary field that aims to create high-level understanding from digital images or videos. The key intention of image analysis is to bridge the semantic gap between low-level descriptions of an image and the high level concept within the image. The main objective of structural pattern analysis is to present the visual data using natural descriptions. Traditionally, this is achieving by extracting low-level visual cues from the data provided, then applying some grouping algorithm to express relationships that are then transformed into more and more complex and convoluted features that generate higher level rules <ref type="bibr" target="#b15">[16]</ref>.</p><p>Document image analysis approaches such as Optical Character Recognition (OCR) are an important part of visual artificial intelligence with many real-world applications. The main objective of these approaches is to identify significant graphical properties from images. In this context, Symbol Recognition has a long history dating back to the 70's. 
In the current state-of-the-art, symbol recognition involves identifying isolated symbols, however this is not enough for some more challenging real-world application. As an example, consider an application where the visual data is represented as a combination of isolated symbols as well as composite symbols that are connected with other graphical elements. Then the statistical approaches which represent shapes only as low level features will have limited success.</p><p>In this paper, we present an approach for one-shot rule learning called One-Shot Hypothesis Derivation (OSHD) which is based on using a logic program declarative bias. We apply this approach to the challenging task of Malayalam character recognition. This is a challenging task due to spherical and complex structure of Malayalam hand-written language. Unlike for other languages, there is currently no efficient algorithm for Malayalam hand-written recognition. The language scripts are mainly based on circular geometrical properties. We have created a dataset for Malayalam hand-written characters which includes high level properties of the language based on 'Omniglot' dataset designed for developing human-level concept learning algorithms <ref type="bibr" target="#b10">[11]</ref>. We compare our results with a state-of-the-art Deep Learning approach, called Siamese Network <ref type="bibr" target="#b7">[8]</ref>, which has been developed for one-shot learning.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Inductive Logic Programming (ILP) and One-Shot Hypothesis Derivation (OSHD)</head><p>Inductive Logic Programming (ILP) has been defined as the intersection of inductive learning and logic programming <ref type="bibr" target="#b20">[21]</ref>. Thus, ILP employs techniques from both machine learning and logic programming. The main objective of ILP, in its simplest form, is to discover the definition of a predicate by observing positive and negative examples of that predicate. Together with positive and negative examples of the target predicate, other background information may also be provided containing further information relevant to learning the target predicate. This background information is represented as a logic program and is called background knowledge(BK). ILP systems develop predicate descriptions from examples and background knowledge. The examples, background knowledge and final descriptions are all described as logic programs.</p><p>The logical notations and foundations of ILP can be found in <ref type="bibr" target="#b20">[21,</ref><ref type="bibr" target="#b24">25]</ref>. The following definition, adapted from <ref type="bibr" target="#b24">[25]</ref>, defines the learning problem setting for ILP. In this definition, |= represents logical entailment and 2 represents an empty clause or logical refutation. Note that in practice, due to the noise in the training examples, the completeness and consistency conditions are usually relaxed. For example, weak consistency is usually used and a noise threshold is considered which allows H to be inconsistent with respect to a certain proportion (or number) of negative examples.</p><p>The following example is adapted from <ref type="bibr" target="#b12">[13]</ref>. 
</p><formula xml:id="formula_0">parent(X, Y ) ← mother(X, Y ), parent(X, Y ) ← f ather(X, Y )}</formula><p>Then both theories H 1 and H 2 defined as follows:</p><formula xml:id="formula_1">H 1 = {daughter(X, Y ) ← f emale(X), parent(Y, X)} H 2 = {daughter(X, Y ) ← f emale(X), mother(Y, X), daughter(X, Y ) ← f emale(X), f ather(Y, X)}</formula><p>are complete and consistent with respect to B and E.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">One-Shot Hypothesis Derivation (OSHD)</head><p>In this paper we adopt a form of ILP which is suitable for one-shot learning and is based on using a logic program declarative bias, i.e. using a logic program to represent the declarative bias over the hypothesis space. Using a logic program declarative bias has several advantages. Firstly, a declarative bias logic program allows us to easily port bias from one problem to another similar problem (e.g. for transfer learning). Secondly, it is possible to reason about the bias at the metalevel. Declarative bias will also help to reduce the size of the search space for the target concept or hypothesis derivation <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b21">22]</ref>. We refer to this approach as One-Shot Hypothesis Derivation (OSHD) which is a special case of Top-Directed Hypothesis Derivation (TDHD) as described in <ref type="bibr" target="#b17">[18]</ref>.</p><p>Definition 2 (One-Shot Hypothesis Derivation). The input to an OSHD system is the vector S T DHD = N T, , B, E, e where N T is a set of "nonterminal" predicate symbols, is a logic program representing the declarative bias over the hypothesis space, B is a logic program representing the background knowledge and E is a set of examples and e is a positive example in E. The following three conditions hold for clauses in : (a) each clause in must contain at least one occurrence of an element of N T while clauses in B and E must not contain any occurrences of elements of N T , (b) any predicate appearing in the head of some clause in must not occur in the body of any clause in B and (c) the head of the first clause in is the target predicate and the head predicates for other clauses in must be in N T . 
The aim of a OSHD learning system is to find a set of consistent hypothesised clauses H, containing no occurrence of N T , such that for each clause h ∈ H the following two conditions hold:</p><formula xml:id="formula_2">|= h (1) B, h |= e<label>(2)</label></formula><p>The following theorem is a special case of Theorem 1 in <ref type="bibr" target="#b17">[18]</ref>. <ref type="formula">1</ref>and <ref type="formula" target="#formula_2">2</ref>hold only if there exists an SLD refutation R of ¬e from , B, such that R can be re-ordered to give R = D h R e where D h is an SLD derivation of a hypothesis h for which (1) and (2) hold.</p><formula xml:id="formula_3">Theorem 1. Given S OSHD = N T, , B, E, e assumptions</formula><p>According to Theorem 1, implicit hypotheses can be extracted from the refutations of e. Let us now consider a simple example. </p><formula xml:id="formula_4">G2 =← property1(a) G1 =← $body(a) ¬e =← alphabet(a) 1 = alphabet(X) ← $body(X) 2 = $body(X) ← property1(X) b1 = property1(a) ←</formula><formula xml:id="formula_5">N T = {$body} B = b 1 = property1(a) ← e = alphabet(a) ← =        1 : alphabet(X) ← $body(X) 2 : $body(X) ← property1(X) 3 : $body(X) ← property2(X)</formula><p>Given the linear refutation, R = ¬e, 1 , 2 , b 1 , as shown in <ref type="figure" target="#fig_0">Figure 1</ref>, we now construct the re-ordered refutation R = D h R e where D h = 1 , 2 derives the clause h = alphabet(X) ← property1(X) for which (1) and (2) hold.</p><p>The user of OSHD can specify a declarative bias in the form of a logic program. A general theory can be also generated from user specified mode declarations. Below is a simplified example of user specified mode declarations and the automatically constructed theory.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">The OSHD Learning Algorithm</head><p>The OSHD Learning algorithm can be described in 3 main steps:</p><p>1. Generate all hypotheses, H e that are generalizations of e modeh(alphabet(+image)). modeb(has prop1(+image)). modeb(has prop2(+image)).</p><formula xml:id="formula_6">=              1 : alphabet(X) ← $body(X).</formula><p>2 : $body(X) ← .%emptybody <ref type="bibr" target="#b2">3</ref> : $body(X) ← has prop1(X), $body(X). <ref type="bibr" target="#b3">4</ref> : $body(X) ← has prop2(X), $body(X). In step 1, H e is generated using the OSHD hypothesis derivation described earlier in this section.</p><p>The second step of the algorithm, computing the coverage of each hypothesis, is not needed if the user program is a pure logic program (i.e. all relationships in the background knowledge are self contained and do not rely on Prolog built in predicates). This is because, by construction, the OSHD hypothesis derivation generates all hypotheses that entail a given example with respect to the user supplied mode declarations. This implies that the coverage of an hypothesis is exactly the set of examples that have it as their generalization. However, this coverage computation step is needed for the negative examples, as they were not used to build the hypothesis set.</p><p>For step 3, the compression-based evaluation function used for the experiments in this paper is:</p><formula xml:id="formula_7">Covered Examples W eight − T otal Literals<label>(3)</label></formula><p>The weight associated to an example may be defined by the user but by </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Siamese Neural Networks</head><p>In this paper, we use a state-of-the-art Deep Learning approach, called Siamese Network <ref type="bibr" target="#b7">[8]</ref>, which has been developed for one-shot learning. The original Siamese Networks were first introduced in the early 1990s by Bromley and LeCun to solve signature verification as an image matching problem <ref type="bibr" target="#b4">[5]</ref>. A Siamese network is a Deep Learning architecture with two parallel neural networks with the same properties in terms of weight, layers etc. Each network takes a different input, and whose outputs are combined using energy function at the top to provide some prediction. The energy function computes some metric between the highest level feature representation on each side ( <ref type="figure" target="#fig_4">Figure 3</ref>). Weight tying guarantees that two extremely similar images could not possibly be mapped by their respective networks to very different locations in feature space because each network computes the same function. Also, the network is symmetric, so that whenever we present two distinct images to the twin networks, the top conjoining layer will compute the same metric as if we were to present the same two images but to the opposite twin.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">One-Shot learning for Malayalam character recognition</head><p>We apply One-Shot Hypothesis Derivation (OSHD) as well as Deep Learning (i.e. Siamese Network) to the challenging task of one-shot Malayalam character recognition. This is a challenging task due to spherical and complex structure of Malayalam hand-written language.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Character recognition and human-like background knowledge</head><p>Malayalam is one of the four major languages of the Dravidian language family and originated from the ancient Brahmi script. Malayalam is the official language of Kerala, a state of India with roughly forty-five million people. Unlike for other languages, there is currently no efficient algorithm for Malayalam hand-written recognition. The basic Malayalam characters along with International Phonetic Alphabet (IPA) are shown in <ref type="figure" target="#fig_6">Figure 4</ref>   We selected the hand-written characters from 'Omniglot' dataset <ref type="bibr" target="#b10">[11]</ref>. Sample Malayalam alphabets from our dataset are shown in <ref type="figure" target="#fig_7">Figure 5</ref>. Feature extraction is conducted utilizing a set of advanced geometrical features <ref type="bibr" target="#b26">[27]</ref> and directional features.</p><p>Geometrical Features Every character may be identified by its geometric designations such as loops, junctions, arcs, and terminals. Geometrically, loop means a closed path. Malayalam characters contain more intricate loops which  <ref type="figure" target="#fig_8">Figure 6</ref>(b). If the figure has a continuous closed curve then we will identify it as a loop. Junctions may be defined as a meeting point of two or more curves or line. It is easy for human to identify the junction from an image as shown in <ref type="figure" target="#fig_8">Figure 6</ref>(c). As per dictionary definitions, an arc is a component of a curve. So in our case, a path with semi opening will be considered as an arc. Please refer to <ref type="figure" target="#fig_8">Figure 6(d)</ref> for more details. Terminals may be classified as points where the character stroke ends, i.e. no more connection beyond that point. 
<ref type="figure" target="#fig_8">Figure 6</ref>(e) is a self-explanatory example for the definition.</p><p>We have included the visual explanation for the geometrical feature extraction in <ref type="figure" target="#fig_8">Figure 6</ref>. We have selected two characters to explicate the features as shown in <ref type="figure" target="#fig_7">Figure 5</ref> and marked each geometrical features as we discussed. <ref type="table" target="#tab_1">Table  1</ref> will give an abstract conception about the dataset we have developed for the experiments from 'Omniglot' dataset.</p><p>Directional Features Every character may be identified by its directional specifications such as starting and ending points of the stroke. There are certain unwritten rules for Malayalam characters, e.g. it always commences from left and moves towards the right direction. Native Malayalam users can easily identify the starting and ending point. However, we will need to consider the starting and ending point as features so that these can be easily identified without semantic knowledge of a character. The starting and ending points are determined by standard direction properties as shown in the <ref type="figure" target="#fig_8">Figure 6</ref>(a). <ref type="figure" target="#fig_8">Figure 6</ref>(f, g) will give you an idea about developing the directional features from an alphabet. Character ID:13 is the corresponding entry for the character shown in figure 6(g). As we discussed, a user can identify both starting and ending point of the character displayed in <ref type="figure" target="#fig_8">Figure 6</ref>(g) easily whereas the terminus point of <ref type="figure" target="#fig_8">Figure  6</ref>(f) is arduous to determine.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Mode declarations and background knowledge representation</head><p>In this section, we define the OSHD specific details of the declarative bias, defined by mode declaration and background knowledge representation used in our experiments. The first step was to develop and represent the background knowledge based on the concepts described in Section 4.1. <ref type="table" target="#tab_1">Table 1</ref> shows the geometrical and directional features of 18 characters from 5 different alphabets used in our experiments.</p><p>Mode declaration and declarative bias In this section we describe how the declarative bias for the hypothesis space was defined using mode declarations. Here, we use the same notations used in Progol <ref type="bibr" target="#b16">[17]</ref> and Toplog <ref type="bibr" target="#b17">[18]</ref>. There are two types of mode declarations.</p><p>1. modeh : defines the head of a hypothesised rule. 2. modeb : defines the literals (conditions) that may appear in the body of a hypothesised rule.</p><p>For example, in our experiments, alphabet(+character) is the head of the hypothesis, where +character defines the character identifier character as an  The meaning of each modeb condition is defined as follows:</p><p>has gemproperties/2 predicate was used to represent the geometrical features as defined in <ref type="table" target="#tab_1">Table 1</ref>. The input argument character is the unique identifier for an alphabet, properties refers to the property name. has gemproperties count/3 predicate outlines the count of the particular feature associated with the alphabet. The properties is the unique identifier for a particular geometrical property of a particular alphabet , geo f eature name refers to the property name and int stands for the feature count. has dirproperties/2 predicate used to represent the directional features mentioned in table 1. 
The character is the unique identifier for the alphabet, properties refers to the property name. has dirproperties count/3 predicate outlines the count of a particular directional feature associated with the alphabet. The properties is a unique identifier for a particular property of a particular alphabet , dir f eature name refers to the property name and f eaturevalue stands for the feature vale.</p><p>Background knowledge representation As defined in Definition 1, background knowledge is a set of clauses representing the background knowledge about a problem. In general, background knowledge can be represented as a logic program and could include general first-order rules. However, in this paper we only consider ground fact background knowledge. In the listing 1.2 we have a sample background knowledge for an alphabet. %% G e o m e t r i c a l F e a t u r e ' Terminals ' with f e a t u r e count h a s g e m p r o p e r t i e s ( c h a r a c t e r 0 , t e r m i n a l s 0 ) . h a s g e m p r o p e r t i e s c o u n t ( t e r m i n a l s 0 , t e r m i n a l s , 2 ) . %% D i r e c t i o n a l F e a t u r e ' S t a r t i n g Point ' with f e a t u r e h a s d i r p r o p e r t i e s ( c h a r a c t e r 0 , s t a r t s 0 ) . h a s d i r p r o p e r t i e s f e a t u r e ( s t a r t s 0 , s t a r t s a t , sw ) .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Experiments</head><p>In this section we evaluate the OSHD approach for complex character recognition as described in this paper. We also compare the performance of OSHD with a state-of-the-art Deep Leaning architecture for one-shot learning, i.e. the Siamese Network approach described in Section 4. In particular we test the following null hypotheses:</p><p>Null Hypothesis H1 OSHD cannot outperform Siamese Networks in one-shot learning for complex character recognition. Null Hypothesis H2 OSHD cannot learn human comprehensible rules for complex character recognition.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Materials and Methods</head><p>The OSHD algorithm in this experiment is based on Top-Directed Hypothesis Derivation implemented in Toplog <ref type="bibr" target="#b17">[18]</ref>, and uses mode declarations and background knowledge which defined earlier in this paper. The Siamese Network used in the experiment is based on the implementation described in <ref type="bibr" target="#b7">[8]</ref>.</p><p>The challenging Malayalam character recognition dataset and the machine learning codes, configurations and input files are available from: https://github.com/danyvarghese/One-Shot-ILP</p><p>We have selected 5 complex alphabets from the 'Omniglot' dataset <ref type="bibr" target="#b10">[11]</ref>. Example characters used in our experiment and their visual properties (developed using the geometrical and directional concepts, as discussed in Section 4) are listed in <ref type="table" target="#tab_2">Table 2</ref>. We have endeavoured to reiterate the same concept of working in both architectures and repeated the experiments for different number of folds and each fold consists of single positive example and n negative examples, where n varies from 1 to 4 and the negative examples are selected from other alphabets. In our experiment we are using the term 'number of classes' in different aspect. The number of classes is defined by the total number of examples (i.e. 1 positive and n negative) used for the cross-validation.</p><p>In the following we define specific parameter settings for each algorithm.</p><p>OSHD parameter settings The following Toplog parameter settings were used in this experiment.</p><p>clause length (value = 15) defines the maximum number of literals (including the head) of a hypothesis. weight the weight of negative example is taken always as the default value. The weight of positive example is the number of negative examples for that class. 
During the cross-validation test, we add one more positive example of the same alphabet for each fold. The weight of newly added example will not be greater than the previous one included in the same fold.  -Convolution with 64 (10 × 10) filters uses 'relu' activation function.</p><p>-'max pooling' convolution 128 (7 × 7) filters with 'relu' activation function.</p><p>-'max pooling' convolution 128 (4 × 4) filters with 'relu' activation function.</p><p>-'max pooling' convolution 256 (4 × 4) filters with 'relu' activation function.</p><p>The twin networks reduce their inputs down to smaller and smaller 3D tensors. Finally, there is a fully connected layer with 4096 units. In most of the implementations of Siamese Network, they are trying to develop the training model from a high amount of data. Also, particularly in the case of character recognition, they compare a character from a language against the characters from other languages <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b3">4]</ref>. In our experiment we have only considering alphabets from a single language. with an average difference of more than 20%. In this figure the random curve represents the default accuracy of random guess. The accuracy for one class prediction is always 100%. Null hypothesis H1 is therefore refuted by this experiment. A better predictive accuracy of OSHD compared to the Siamese Net could be explained by the fact that it uses background knowledge. <ref type="table" target="#tab_3">Table 3</ref> shows example of learned rules by OSHD generated from one positive and two negative examples. One can easily differentiate alphabet 'Aha' against 'Eh' &amp; 'Uh'. The unique properties of 'Ah' from others alphabets is given in the column 'Human Interpretations', which is almost similar to the learned rule in column 4. It is also clear that the rule in column 4 is human comprehensible. Null hypothesis H2 is therefore refuted by this experiment.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Results and Discussions</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusion</head><p>In this paper, we presented a novel approach for one-shot rule learning called One-Shot Hypothesis Derivation (OSHD) which is based on using a logic program declarative bias. We applied this approach to the challenging task of Malayalam character recognition. This is a challenging task due to spherical and complex structure of Malayalam hand-written language. Unlike for other languages, there is currently no efficient algorithm for Malayalam hand-written recognition.  The features used to express the background knowledge were developed in such a way that it is acceptable for human visual cognition also. We could learn rules for each character which is more natural and visually acceptable. We compared our results with a state-of-the-art Deep Learning approach, called Siamese Network, which has been developed for one-shot learning. The results suggest that our approach can generate human-understandable rules and also outperforms the deep learning approach with a significantly higher average predictive accuracy (an increase of more than 20% in average). Its was clear from the results that deep learning paradigm use more data and its efficiency is less when dealing with a small amount of data. As future work we would like to further extend the background knowledge to include more semantic information. 
We will also explore the new framework of Meta-Interpretive Learning (MIL) <ref type="bibr" target="#b18">[19,</ref><ref type="bibr" target="#b19">20]</ref> in order to learn recursive rules.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Example 1 .</head><label>1</label><figDesc>In Definition 1, let E + , E − and B be defined as follows: E + = {daughter(mary, ann), daughter(eve, tom)} E − = {daughter(tom, ann), daughter(eve, ann)} B = {mother(ann, mary), mother(ann, tom), f ather(tom, eve), f ather(tom, ian), f emale(ann), f emale(mary), f emale(eve), male(pat), male(tom),</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1 :</head><label>1</label><figDesc>SLD-refutation of ¬e Example 2. Let S OSHD = N T, , B, E, e where N T , B , e and are as follows:</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 :</head><label>2</label><figDesc>Mode declarations and a theory automatically constructed from it. 2. Compute the coverage of each hypothesis in H e 3. Build final theory, T , by choosing a subset of hypotheses in H e that maximises a given score function (e.g. compression)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>default, positive examples have weight 1 and negative examples weight -1. In general, negative examples are defined with a weight smaller than 0 and positive examples with a weight greater than 0.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Fig. 3 :</head><label>3</label><figDesc>A simple 2 hidden layer Siamese Neural Network<ref type="bibr" target="#b7">[8]</ref> </figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>1 .</head><label>1</label><figDesc>Handwriting recognition for Malayalam script is a major challenge compared to the recognition of other scripts for the following reasons: - Presence of a large number of alphabets - Different writing styles - Spherical features of alphabets - Similarity in character shapes</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Fig. 4 :</head><label>4</label><figDesc>Sample Malayalam characters: (a) Malayalam Vowels (b) Special Consonants (Chill) (c) Consonants and Consonant Clusters</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Fig. 5 :</head><label>5</label><figDesc>Sample Malayalam characters from our dataset may contain some up and downs within the loops itself. So we follow a concept as shown in</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Fig. 6 :</head><label>6</label><figDesc>Human-like feature extraction criteria</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Listing 1.1 :</head><label>1.1</label><figDesc>Mode declarations :- modeh(1, alphabet(+character)). :- modeb(*, hasgemproperties(+character, -properties)). :- modeb(*, hasgempropertiescount(+properties, #geofeaturename, #int)). :- modeb(*, hasdirproperties(+character, -properties)). :- modeb(*, hasdirpropertiesfeature(+properties, #dirfeaturename, #featurevalue)).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_10"><head>Listing 1.2 :</head><label>1.2</label><figDesc>Sample background knowledge for alphabet 'Aha' %% Geometrical Feature 'Loops' with feature count hasgemproperties(character0, loops0). hasgempropertiescount(loops0, loops, 2). %% Geometrical Feature 'Arcs' with feature count hasgemproperties(character0, arcs0). hasgempropertiescount(arcs0, arcs, 3). %% Geometrical Feature 'Junctions' with feature count hasgemproperties(character0, junctions0). hasgempropertiescount(junctions0, junctions, 4).</figDesc></figure>
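<div xmlns="http://www.tei-c.org/ns/1.0" type="example"><p>The background facts of Listing 1.2 follow a regular pattern that can be generated from a feature row such as those in Table 1. The following Python sketch illustrates that pattern and is not the authors' tooling. The helper name background_facts is hypothetical, the predicate names are those of the extracted listing with the letter-spacing removed (the original may contain underscores that the extraction dropped), and the character0 / loops0 naming scheme is assumed from the single sample shown.</p>
<p><code lang="python" xml:space="preserve"><![CDATA[
```python
def background_facts(char_index, loops, arcs, junctions):
    """Emit geometrical-property facts in the style of Listing 1.2.

    ASSUMPTION: predicate and constant names are taken from the extracted
    listing; the real program may spell them differently.
    """
    character = f"character{char_index}"
    facts = []
    for feature, count in (("loops", loops), ("arcs", arcs), ("junctions", junctions)):
        group = f"{feature}{char_index}"
        # Link the character to a property group, then record the feature count.
        facts.append(f"hasgemproperties({character}, {group}).")
        facts.append(f"hasgempropertiescount({group}, {feature}, {count}).")
    return facts


# Feature counts of alphabet 'Aha' as given in the listing: 2 loops, 3 arcs, 4 junctions.
facts = background_facts(0, loops=2, arcs=3, junctions=4)
print("\n".join(facts))
```
]]></code></p>
<p>Terminals and the directional properties are left out because the sample listing does not show how they are encoded.</p></div>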
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_11"><head></head><label></label><figDesc>positive example inflation (value = 10) multiplies the weights of all positive examples by this factor. negative example inflation (value = 5) multiplies the weights of all negative examples by this factor.</figDesc></figure>
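<div xmlns="http://www.tei-c.org/ns/1.0" type="example"><p>The effect of the two inflation parameters on the default example weights (1 for positive, -1 for negative examples) can be sketched as follows. This is a minimal Python illustration, not the OSHD implementation: the function names are hypothetical, and the weighted-coverage score, taken here as the sum of the weights of the covered examples, is an assumption made only to show how the inflated weights might enter a score.</p>
<p><code lang="python" xml:space="preserve"><![CDATA[
```python
def inflate_weights(weights, positive_inflation=10, negative_inflation=5):
    """Multiply positive weights by one factor and negative weights by another."""
    inflated = []
    for w in weights:
        if w > 0:
            inflated.append(w * positive_inflation)
        elif w < 0:
            inflated.append(w * negative_inflation)
        else:
            inflated.append(w)
    return inflated


def weighted_coverage(weights, covered):
    """ASSUMPTION: hypothetical score, the sum of the weights of covered examples."""
    return sum(w for w, is_covered in zip(weights, covered) if is_covered)


# One positive and two negative examples with the default weights 1 and -1.
weights = inflate_weights([1, -1, -1])
# A hypothesis that covers the positive example and the first negative example.
score = weighted_coverage(weights, [True, True, False])
print(weights, score)
```
]]></code></p>
<p>With these factors a single positive example outweighs one covered negative example, which matters in a one-shot setting where only one positive example is available.</p></div>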
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_12"><head>Fig. 7 :</head><label>7</label><figDesc>Average predictive accuracy of ILP (OSHD) vs Deep Learning (Siamese Net) in one-shot character recognition with increasing number of character classes. According to this figure, OSHD outperforms the Siamese Nets.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>Definition 1 (ILP problem setting). Input: Given B, E, where B is a set of clauses representing the background knowledge and E is the set of positive (E + ) and negative (E − ) examples such that B ⊭ E + . Output: find a theory H such that H is complete and (weakly) consistent with respect to B and E. H is complete with respect to B and E + if B ∧ H ⊨ E + . H is consistent with respect to B and E − if B ∧ H ∧ E − ⊭ □. H is weakly consistent with respect to B if B ∧ H ⊭ □.</figDesc><table /><note></note></figure>
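<div xmlns="http://www.tei-c.org/ns/1.0" type="example"><p>The completeness and consistency conditions of Definition 1 can be checked mechanically on the family data of Example 1. The following Python sketch is an illustration only and not part of OSHD. Two things in it are assumptions, because the extracted example breaks off before the end of B and states no hypothesis: the definition of parent as the union of mother and father, and the candidate hypothesis daughter(X, Y) :- female(X), parent(Y, X).</p>
<p><code lang="python" xml:space="preserve"><![CDATA[
```python
# Ground facts of B as far as they appear in Example 1.
mother = {("ann", "mary"), ("ann", "tom")}
father = {("tom", "eve"), ("tom", "ian")}
female = {"ann", "mary", "eve"}

# ASSUMPTION: parent(X, Y) holds if mother(X, Y) or father(X, Y).
parent = mother | father


def covers(x, y):
    """ASSUMED hypothesis H: daughter(X, Y) :- female(X), parent(Y, X)."""
    return x in female and (y, x) in parent


e_pos = [("mary", "ann"), ("eve", "tom")]  # E+
e_neg = [("tom", "ann"), ("eve", "ann")]   # E-

# Complete: every positive example is derivable from B and H.
complete = all(covers(x, y) for x, y in e_pos)
# Consistent: no negative example is derivable from B and H.
consistent = not any(covers(x, y) for x, y in e_neg)
print(complete, consistent)
```
]]></code></p>
<p>For ground examples and a single non-recursive clause, entailment reduces to the membership tests used here; a general ILP system would call a Prolog engine instead.</p></div>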
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 :</head><label>1</label><figDesc>Geometrical and Directional Properties</figDesc><table>
<row><cell>Character ID</cell><cell cols="4">Geometrical Properties</cell><cell cols="2">Directional Properties</cell></row>
<row><cell></cell><cell>No. Loops</cell><cell>No. Junctions</cell><cell>No. Arcs</cell><cell>No. Terminals</cell><cell>Starting Point</cell><cell>Ending Point</cell></row>
<row><cell>1</cell><cell>2</cell><cell>4</cell><cell>3</cell><cell>2</cell><cell>sw</cell><cell>null</cell></row>
<row><cell>2</cell><cell>3</cell><cell>4</cell><cell>3</cell><cell>2</cell><cell>sw</cell><cell>null</cell></row>
<row><cell>3</cell><cell>3</cell><cell>4</cell><cell>3</cell><cell>2</cell><cell>sw</cell><cell>null</cell></row>
<row><cell>4</cell><cell>1</cell><cell>2</cell><cell>3</cell><cell>2</cell><cell>null</cell><cell>se</cell></row>
<row><cell>5</cell><cell>1</cell><cell>3</cell><cell>3</cell><cell>2</cell><cell>nw</cell><cell>se</cell></row>
<row><cell>6</cell><cell>0</cell><cell>1</cell><cell>3</cell><cell>2</cell><cell>nw</cell><cell>se</cell></row>
<row><cell>7</cell><cell>1</cell><cell>1</cell><cell>2</cell><cell>1</cell><cell>null</cell><cell>se</cell></row>
<row><cell>8</cell><cell>1</cell><cell>1</cell><cell>2</cell><cell>2</cell><cell>nw</cell><cell>se</cell></row>
<row><cell>9</cell><cell>1</cell><cell>1</cell><cell>2</cell><cell>1</cell><cell>null</cell><cell>se</cell></row>
<row><cell>10</cell><cell>3</cell><cell>3</cell><cell>4</cell><cell>1</cell><cell>null</cell><cell>se</cell></row>
<row><cell>11</cell><cell>4</cell><cell>4</cell><cell>4</cell><cell>0</cell><cell>null</cell><cell>null</cell></row>
<row><cell>12</cell><cell>3</cell><cell>4</cell><cell>3</cell><cell>0</cell><cell>null</cell><cell>null</cell></row>
<row><cell>13</cell><cell>1</cell><cell>1</cell><cell>2</cell><cell>2</cell><cell>sw</cell><cell>se</cell></row>
<row><cell>14</cell><cell>1</cell><cell>1</cell><cell>1</cell><cell>2</cell><cell>nw</cell><cell>se</cell></row>
<row><cell>15</cell><cell>1</cell><cell>1</cell><cell>1</cell><cell>2</cell><cell>sw</cell><cell>ne</cell></row>
<row><cell>16</cell><cell>0</cell><cell>2</cell><cell>1</cell><cell>2</cell><cell>sw</cell><cell>ne</cell></row>
<row><cell>17</cell><cell>0</cell><cell>0</cell><cell>1</cell><cell>2</cell><cell>sw</cell><cell>ne</cell></row>
<row><cell>18</cell><cell>0</cell><cell>2</cell><cell>1</cell><cell>2</cell><cell>nw</cell><cell>ne</cell></row>
</table><note>input argument. We are using four predicates in the body part of the hypothesis as shown in Listing 1.1. Note that +, - and # indicate input, output or constant-value arguments.</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2 :</head><label>2</label><figDesc>Sample characters &amp; properties. Siamese Networks parameter settings: For the implementation of the Siamese Net, we followed the same setup used by Koch et al. <ref type="bibr" target="#b7">[8]</ref>. Koch et al. use a convolutional Siamese network to classify pairs of 'Omniglot' images, so the twin networks are both Convolutional Neural Nets (CNNs). The twins each have the following architecture:</figDesc><table>
<row><cell>Alphabet</cell><cell>Properties</cell></row>
<row><cell>Alphabet 'Aha' (ID:1)</cell><cell>Loops : 2, Junctions : 4, Arcs : 3, Terminals : 2, Starting Point : SW, Ending Point : Null</cell></row>
<row><cell>Alphabet 'Eh' (ID:4)</cell><cell>Loops : 1, Junctions : 2, Arcs : 3, Terminals : 2, Starting Point : Null, Ending Point : SE</cell></row>
<row><cell>Alphabet 'Uh' (ID:7)</cell><cell>Loops : 1, Junctions : 1, Arcs : 2, Terminals : 1, Starting Point : Null, Ending Point : SE</cell></row>
</table><note></note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3 :</head><label>3</label><figDesc>Example of learned rules</figDesc><table>
<row><cell>+ve Example</cell><cell>−ve Example</cell><cell>Human Interpretations</cell><cell>Learned Rules</cell></row>
<row><cell>Alphabet 'Aha': Loops : 2, Junctions : 4, Arcs : 3, Terminals : 2, Starting Point : SW, Ending Point : Null</cell><cell>Alphabet 'Eh': Loops : 1, Junctions : 2, Arcs : 3, Terminals : 2, Starting Point : Null, Ending Point : SE. Alphabet 'Uh': Loops : 1, Junctions : 1, Arcs : 2, Terminals : 1, Starting Point : Null, Ending Point : SE</cell><cell>Loops : 2 Junctions : 4 Starting Point :</cell><cell></cell></row>
</table><note></note></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1">From: https://sites.google.com/site/personaltesting1211/malayalam-alphabet.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>We would like to acknowledge Stephen Muggleton and Jose Santos for the development of Top Directed Hypothesis Derivation and Toplog <ref type="bibr" target="#b17">[18]</ref> which was the basis for One-Shot Hypothesis Derivation (OSHD) presented in this paper. We also acknowledge the Vice Chancellor's PhD Scholarship Award at the University of Surrey.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Declarative bias for specific-to-general ILP systems</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">H</forename><surname>Adé</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">L</forename><surname>De Raedt</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">M</forename><surname>Bruynooghe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="page" from="119" to="154" />
			<date type="published" when="1995" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Learning Deep Architectures for AI</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Found. Trends Mach. Learn.</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="127" />
			<date type="published" when="2009-01" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Contrasting advantages of learning with random weights and backpropagation in non-volatile memory neural networks</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">C</forename><forename type="middle">H</forename><surname>Bennett</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">V</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">L</forename><forename type="middle">E</forename><surname>Calvet</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Klein</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">M</forename><surname>Suri</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">M</forename><forename type="middle">J</forename><surname>Marinella</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">D</forename><surname>Querlioz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="73938" to="73953" />
			<date type="published" when="2019" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">One shot learning and siamese networks in keras</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><surname>Bouma</surname></persName>
		</author>
		<ptr target="https://sorenbouma.github.io/blog/oneshot" />
		<imprint>
			<date type="published" when="2017" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Signature verification using a &quot;siamese&quot; time delay neural network</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Bromley</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">I</forename><surname>Guyon</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">Y</forename><surname>LeCun</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">E</forename><surname>Säckinger</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">R</forename><surname>Shah</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 6th International Conference on Neural Information Processing Systems</title>
		<meeting>the 6th International Conference on Neural Information Processing Systems<address><addrLine>San Francisco, CA, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Morgan Kaufmann Publishers Inc</publisher>
			<date type="published" when="1993" />
			<biblScope unit="page" from="737" to="744" />
		</imprint>
	</monogr>
	<note>NIPS&apos;93</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A novel approach for single image super resolution using statistical mathematical model</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">Dany</forename><surname>Varghese</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">Viju</forename><surname>Shankar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Applied Engineering Research (IJAER)</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">44</biblScope>
			<date type="published" when="2015" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Learning multiple layers of representation</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">G</forename><surname>Hinton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Trends in cognitive sciences</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="428" to="434" />
			<date type="published" when="2007" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Siamese neural networks for one-shot image recognition</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">G</forename><surname>Koch</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">R</forename><surname>Zemel</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">R</forename><surname>Salakhutdinov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 32nd International Conference on Machine Learning</title>
		<meeting>the 32nd International Conference on Machine Learning</meeting>
		<imprint>
			<date type="published" when="2015" />
			<biblScope unit="volume">37</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
				<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><surname>Krig</surname></persName>
		</author>
		<title level="m">Computer Vision Metrics Survey, Taxonomy, and Analysis</title>
		<imprint>
			<publisher>Apress</publisher>
			<pubPlace>Berkeley, CA</pubPlace>
			<date type="published" when="2014" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">One shot learning of simple visual concepts</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">B</forename><surname>Lake</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">R</forename><surname>Salakhutdinov</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Gross</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Tenenbaum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 33rd Annual Conference of the Cognitive Science Society</title>
		<meeting>the 33rd Annual Conference of the Cognitive Science Society</meeting>
		<imprint>
			<date type="published" when="2011" />
			<biblScope unit="page" from="2568" to="2573" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Human-level concept learning through probabilistic program induction</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">B</forename><forename type="middle">M</forename><surname>Lake</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">R</forename><surname>Salakhutdinov</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><forename type="middle">B</forename><surname>Tenenbaum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Science</title>
		<imprint>
			<biblScope unit="volume">350</biblScope>
			<biblScope unit="issue">6266</biblScope>
			<biblScope unit="page" from="1332" to="1338" />
			<date type="published" when="2015" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">One shot learning with siamese networks using keras</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">H</forename><surname>Lamba</surname></persName>
		</author>
		<ptr target="https://towardsdatascience.com/one-shot-learning-with-siamese-networks-using-keras-17f34e75bb3d" />
		<imprint>
			<date type="published" when="2019" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Inductive Logic Programming : Techniques and Applications</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">N</forename><surname>Lavrač</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><surname>Džeroski</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1993" />
			<publisher>Ellis Horwood</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Learning hierarchical invariant spatiotemporal features for action recognition with independent subspace analysis</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">W</forename><forename type="middle">Y</forename><surname>Zou</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><forename type="middle">Y</forename><surname>Yeung</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">A</forename><forename type="middle">Y</forename><surname>Ng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CVPR 2011</title>
		<imprint>
			<date type="published" when="2011" />
			<biblScope unit="page" from="3361" to="3368" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Multi-task deep neural networks for natural language understanding</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">P</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">W</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">An incremental on-line parsing algorithm for recognizing sketching diagrams</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Mas</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">G</forename><surname>Sanchez</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Llados</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">B</forename><surname>Lamiroy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Ninth International Conference on Document Analysis and Recognition</title>
		<imprint>
			<date type="published" when="2007" />
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="452" to="456" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Inverse entailment and Progol</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><surname>Muggleton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">New Generation Computing</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page" from="245" to="286" />
			<date type="published" when="1995" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">TopLog: ILP using a logic program declarative bias</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><forename type="middle">H</forename><surname>Muggleton</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">A</forename><surname>Tamaddoni-Nezhad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference on Logic Programming</title>
		<meeting>the International Conference on Logic Programming</meeting>
		<imprint>
			<publisher>Springer-Verlag</publisher>
			<date type="published" when="2008" />
			<biblScope unit="volume">5366</biblScope>
			<biblScope unit="page" from="687" to="692" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Meta-interpretive learning of higher-order dyadic datalog: Predicate invention revisited</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><surname>Muggleton</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">D</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">A</forename><surname>Tamaddoni-Nezhad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">100</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="49" to="73" />
			<date type="published" when="2015" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Meta-interpretive learning from noisy images</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><surname>Muggleton</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">W</forename><forename type="middle">Z</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">C</forename><surname>Sammut</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">A</forename><surname>Tamaddoni-Nezhad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">107</biblScope>
			<date type="published" when="2018" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Inductive logic programming: Theory and methods</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><surname>Muggleton</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">L</forename><surname>De Raedt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">special Issue: Ten Years of Logic Programming</title>
		<imprint>
			<date type="published" when="1994" />
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="629" to="679" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
				<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">C</forename><surname>Nedellec</surname></persName>
		</author>
		<title level="m">Declarative bias in ILP</title>
		<imprint>
			<date type="published" when="1996" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">An incremental semi-supervised approach for visual domain adaptation</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">K</forename><forename type="middle">S</forename><surname>Neethu</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">D</forename><surname>Varghese</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2017 International Conference on Communication and Signal Processing (ICCSP)</title>
		<imprint>
			<date type="published" when="2017" />
			<biblScope unit="page" from="1343" to="1346" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Deep neural networks are easily fooled: High confidence predictions for unrecognizable images</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">A</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Yosinski</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Clune</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
		<imprint>
			<date type="published" when="2015" />
			<biblScope unit="page" from="427" to="436" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">Foundations of Inductive Logic Programming</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><forename type="middle">H</forename><surname>Nienhuys-Cheng</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">R</forename><surname>De Wolf</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1997" />
			<publisher>Springer-Verlag</publisher>
			<biblScope unit="volume">1228</biblScope>
			<pubPlace>Berlin</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Intriguing properties of neural networks</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">C</forename><surname>Szegedy</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">W</forename><surname>Zaremba</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">J</forename><surname>Bruna</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">D</forename><surname>Erhan</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">I</forename><surname>Goodfellow</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">R</forename><surname>Fergus</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
		<imprint>
			<date type="published" when="2014" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Geometric feature points based optical character recognition</title>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">M</forename><surname>Usman Akram</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">Z</forename><surname>Bashir</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">A</forename><surname>Tariq</surname></persName>
		</author>
		<author>
			<persName xmlns="http://www.tei-c.org/ns/1.0"><forename type="first">S</forename><forename type="middle">A</forename><surname>Khan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2013 IEEE Symposium on Industrial Electronics Applications</title>
		<imprint>
			<date type="published" when="2013" />
			<biblScope unit="page" from="86" to="89" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>