Get Folder size of HDFS from java - hdfs

I need to get the size of an HDFS folder (which has subdirectories) from Java.
From the command line we can use the -dus option, but can anyone help me with how to get the same thing using Java?

The getSpaceConsumed() function in the ContentSummary class returns the actual space the file/directory occupies in the cluster, i.e. it takes into account the replication factor set for the cluster.
For instance, if the replication factor in the Hadoop cluster is set to 3 and the directory size is 1.5GB, the getSpaceConsumed() function will return the value 4.5GB.
The getLength() function in the ContentSummary class returns the actual file/directory size.
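As a quick illustration, here is a minimal Scala sketch (the directory path is a placeholder, and the Hadoop configuration is assumed to be available on the classpath) that prints both values from a single ContentSummary:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object DirSizeExample extends App {
  // Picks up core-site.xml/hdfs-site.xml from the classpath
  val fs = FileSystem.get(new Configuration())
  val summary = fs.getContentSummary(new Path("/inputdir")) // placeholder directory
  println(s"getLength (logical size): ${summary.getLength} bytes")
  println(s"getSpaceConsumed (including replication): ${summary.getSpaceConsumed} bytes")
}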

You could use the getContentSummary(Path f) method provided by the FileSystem class. It returns a ContentSummary object on which the getSpaceConsumed() method can be called, giving you the size of the directory in bytes.
Usage:
package org.myorg.hdfsdemo;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GetDirSize {

    /**
     * @param args
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {
        Configuration config = new Configuration();
        config.addResource(new Path(
                "/hadoop/projects/hadoop-1.0.4/conf/core-site.xml"));
        config.addResource(new Path(
                "/hadoop/projects/hadoop-1.0.4/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(config);
        Path filenamePath = new Path("/inputdir");
        System.out.println("SIZE OF THE HDFS DIRECTORY : "
                + fs.getContentSummary(filenamePath).getSpaceConsumed());
    }
}
HTH

Thank you guys.
Scala version
package com.beloblotskiy.hdfsstats.model.hdfs

import java.nio.file.{Files => NioFiles, Paths => NioPaths}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path
import org.apache.commons.io.IOUtils
import com.beloblotskiy.hdfsstats.common.Settings

/**
 * HDFS utilities
 * @author v-abelablotski
 */
object HdfsOps {
  private val conf = new Configuration()
  conf.addResource(new Path(Settings.pathToCoreSiteXml))
  conf.addResource(new Path(Settings.pathToHdfsSiteXml))
  private val fs = FileSystem.get(conf)

  /**
   * Calculates disk usage with the replication factor taken into account.
   * If this function returns 3 GB for a folder with replication factor 3, it means HDFS stores 1 GB of files multiplied by 3 copies.
   */
  def duWithReplication(path: String): Long = {
    val fsPath = new Path(path)
    fs.getContentSummary(fsPath).getSpaceConsumed()
  }

  /**
   * Calculates disk usage without taking the replication factor into account.
   * The result will be the same as hadoop fs -du /hdfs/path/to/directory.
   */
  def du(path: String): Long = {
    val fsPath = new Path(path)
    fs.getContentSummary(fsPath).getLength()
  }

  //...
}

Spark-shell tool to show all tables and their consumption
A typical and illustrative spark-shell tool that loops over all databases, tables and partitions to get their sizes and reports them to a CSV file:
// sshell -i script.scala > ls.csv
import org.apache.hadoop.fs.{FileSystem, Path}

def cutPath(thePath: String, toCut: Boolean = true): String =
  if (toCut) thePath.replaceAll("^.+/", "") else thePath

val warehouse = "/apps/hive/warehouse" // the Hive default location for all databases
val fs = FileSystem.get(sc.hadoopConfiguration)

println(s"base,table,partitions,bytes")
fs.listStatus(new Path(warehouse)).foreach( x => {
  val b = x.getPath.toString
  fs.listStatus(new Path(b)).foreach( x => {
    val t = x.getPath.toString
    var parts = 0; var size = 0L; // var size3 = 0L
    fs.listStatus(new Path(t)).foreach( x => {
      // partition path is x.getPath.toString
      val p_cont = fs.getContentSummary(x.getPath)
      parts = parts + 1
      size = size + p_cont.getLength
      //size3 = size3 + p_cont.getSpaceConsumed
    }) // t loop
    println(s"${cutPath(b)},${cutPath(t)},${parts},${size}")
    // display opt org.apache.commons.io.FileUtils.byteCountToDisplaySize(size)
  }) // b loop
}) // warehouse loop

System.exit(0) // get out from spark-shell
PS: I checked, size3 is always 3*size, no extra information.

Related

scalaFX - TitledPane: how do I get the height of the content?

I created a TitledPane in ScalaFX
val titled: TitledPane = new TitledPane()
and put some nodes in it for my GUI.
Later I want to read out the height of the content of titled.
In JavaFX this would be done with:
((Region) titled.getContent()).getHeight()
But if I try to read the height of the content in Scala with:
titled.content.height
the height is marked as deprecated and it does not compile. I've got a hint on GitHub (scalafx/issue69) that explains why it is deprecated but does not explain how it can be done instead.
Just to clarify: I want to read out the height of the content of the TitledPane, not just titled.height.
When titled is closed, titled.height is 0, but I want to know what it would be when it is expanded (actually, to detect when it has finished expanding).
So, how can I do this in scalafx?
EDIT:
Here is an example that shows the described error:
import scalafx.Includes._
import scalafx.application.JFXApp
import scalafx.beans.property.DoubleProperty
import scalafx.beans.value.ObservableValue
import scalafx.collections.ObservableBuffer
import scalafx.event.ActionEvent
import scalafx.scene.Scene
import scalafx.scene.control.cell.TextFieldListCell
import scalafx.scene.control.{Button, ListView, TitledPane}
import scalafx.scene.layout.BorderPane
object TitledPaneEndOfExpansion extends JFXApp {
val expandedHeight = new DoubleProperty()
val data: ObservableBuffer[String] = new ObservableBuffer[String]() ++= List("some", "content", "for", "testing")
stage = new JFXApp.PrimaryStage {
title = "JavaFX: edit after rendering test"
val list: ListView[String] = new ListView[String](data) {
editable = true
cellFactory = TextFieldListCell.forListView()
height.onChange { (source: ObservableValue[Double, Number], oldValue: Number, newValue: Number) =>
expandedHeight.value = titled.content.height
println("old height is: " + oldValue.doubleValue() + " new height is: " + newValue.doubleValue())
if (newValue.doubleValue() == expandedHeight.value) {
edit(1)
}
}
}
val titled: TitledPane = new TitledPane {
text = "titled"
content = list
}
scene = new Scene {
root = new BorderPane {
center = titled
bottom = new Button() {
text = "edit cell 1"
onAction = { _: ActionEvent => list.edit(1) }
}
}
}
expandedHeight.value = titled.content.height //set to 400
list.edit(1)
}
}
And here is the build.sbt file:
name := "JavaFXrenderingProblem"
version := "0.1"
scalaVersion := "2.13.3"
libraryDependencies += "org.scalafx" %% "scalafx" % "15.0.1-R21"
libraryDependencies += "org.controlsfx" % "controlsfx" % "8.40.18"
// Prevent startup bug in JavaFX
fork := true
// Tell Javac and scalac to build for jvm 1.8
javacOptions ++= Seq("-source", "1.8", "-target", "1.8")
scalacOptions += "-target:jvm-1.8"
scalacOptions += "-feature"
When I compile with plain sbt, I get the compile error message:
[info] compiling 1 Scala source to ... JavaFXrenderingProblem\target\scala-2.13\classes ...
[error] ... JavaFXrenderingProblem\src\main\scala\TitledPaneEndOfExpansion.scala:38:47: value height is not a member of scalafx.beans.property.ObjectProperty[javafx.scene.Node]
[error] expandedHeight.value = titled.content.height
[error] ^
[error] one error found
[error] (Compile / compileIncremental) Compilation failed
[error] Total time: 3 s, completed 03.05.2021 11:09:02
I actually get two errors when I execute sbt run on your code, and I do not get a deprecation error:
[info] compiling 1 Scala source to C:\Users\SomeUser\src\SFC\target\scala-2.13\classes ...
[error] C:\Users\SomeUser\src\SFX\src\main\scala\TitledPaneEndOfExpansion.scala:23:41: value height is not a member of scalafx.beans.property.ObjectProperty[javafx.scene.Node]
[error] expandedHeight.value = titled.content.height
[error] ^
[error] C:\Users\MichaelAllen\src\SOSFX\src\main\scala\TitledPaneEndOfExpansion.scala:45:40: value height is not a member of scalafx.beans.property.ObjectProperty[javafx.scene.Node]
[error] expandedHeight.value = titled.content.height //set to 400
[error] ^
[error] two errors found
[error] (Compile / compileIncremental) Compilation failed
[error] Total time: 3 s, completed May 3, 2021 9:58:00 AM
From your code, the list value returns the contents of the TitledPane instance, titled, as a ListView[String]. It is this object whose height method you're trying to call. Correct?
The primary problem is that the content method of titled doesn't know enough about the type of the object that titled is storing. All it knows is that it is derived from javafx.scene.Node. Such Node instances do not have a height property, and hence your errors. (It's actually a little more complicated than that, but that's the simplest way to explain the issue.)
However, you already have a reference to the object that is the content of titled: list. So you can replace the second reference to titled.content.height with list.height. The first reference, in list's height.onChange handler, is accessible through the source parameter (it identifies the property whose value changed, namely list.height in this case). So you can replace titled.content.height with source in this case.
I notice that you're using a DoubleProperty type for expandedHeight in your example, but you then have to keep extracting the value from it. That's not very idiomatic. If you don't need this value to be reactive, a simple Double would suffice (but this would require that expandedHeight be declared as a var).
Combined, this produces the following code:
import scalafx.Includes._
import scalafx.application.JFXApp
import scalafx.beans.property.DoubleProperty
import scalafx.beans.value.ObservableValue
import scalafx.collections.ObservableBuffer
import scalafx.event.ActionEvent
import scalafx.scene.Scene
import scalafx.scene.control.cell.TextFieldListCell
import scalafx.scene.control.{Button, ListView, TitledPane}
import scalafx.scene.layout.BorderPane
object TitledPaneEndOfExpansion extends JFXApp {
var expandedHeight: Double = _
val data: ObservableBuffer[String] = new ObservableBuffer[String]() ++= List("some", "content", "for", "testing")
stage = new JFXApp.PrimaryStage {
title = "JavaFX: edit after rendering test"
val list: ListView[String] = new ListView[String](data) {
editable = true
cellFactory = TextFieldListCell.forListView()
height.onChange { (source: ObservableValue[Double, Number], oldValue: Number, newValue: Number) =>
expandedHeight = source.value
println("old height is: " + oldValue.doubleValue() + " new height is: " + newValue.doubleValue())
if (newValue.doubleValue() == expandedHeight) {
edit(1)
}
}
}
val titled: TitledPane = new TitledPane {
text = "titled"
content = list
}
scene = new Scene {
root = new BorderPane {
center = titled
bottom = new Button() {
text = "edit cell 1"
onAction = { _: ActionEvent => list.edit(1) }
}
}
}
expandedHeight = list.height.value //set to 400
list.edit(1)
}
}
Your code then compiles and runs.
Updated
ScalaFX is simply a wrapper for JavaFX: each JavaFX type has a corresponding ScalaFX type. ScalaFX provides implicit conversion functions to seamlessly convert, say, a JavaFX TitledPane to a ScalaFX TitledPane, and vice versa. However, there's no inheritance relationship between the two sets of objects. That is, a JavaFX TitledPane has no type relationship to a ScalaFX TitledPane. Casting between the two sets of objects is therefore a complicated process.
If you wanted to be able to cast titled.content correctly in order to access the height property of the contents more directly, you would need to get the property's value and explicitly pattern match on the result with the JavaFX version of the object, as follows:
import javafx.scene.control.{ListView => JFXListView}
import scalafx.Includes._
import scalafx.application.JFXApp
import scalafx.beans.property.DoubleProperty
import scalafx.beans.value.ObservableValue
import scalafx.collections.ObservableBuffer
import scalafx.event.ActionEvent
import scalafx.scene.Scene
import scalafx.scene.control.cell.TextFieldListCell
import scalafx.scene.control.{Button, ListView, TitledPane}
import scalafx.scene.layout.BorderPane
object TitledPaneEndOfExpansion extends JFXApp {
var expandedHeight: Double = _
val data: ObservableBuffer[String] = new ObservableBuffer[String]() ++= List("some", "content", "for", "testing")
stage = new JFXApp.PrimaryStage {
title = "JavaFX: edit after rendering test"
val list: ListView[String] = new ListView[String](data) {
editable = true
cellFactory = TextFieldListCell.forListView()
height.onChange { (source: ObservableValue[Double, Number], oldValue: Number, newValue: Number) =>
expandedHeight = titled.content.value match {
case lv: JFXListView[_] => lv.height.value
case _ => {
throw new RuntimeException(s"Unexpected content type: ${titled.content.getClass.getCanonicalName}")
}
}
println("old height is: " + oldValue.doubleValue() + " new height is: " + newValue.doubleValue())
if (newValue.doubleValue() == expandedHeight) {
edit(1)
}
}
}
val titled: TitledPane = new TitledPane {
text = "titled"
content = list
}
scene = new Scene {
root = new BorderPane {
center = titled
bottom = new Button() {
text = "edit cell 1"
onAction = { _: ActionEvent => list.edit(1) }
}
}
}
expandedHeight = titled.content.value match { //set to 400
case lv: JFXListView[_] => lv.height.value
case _ => throw new RuntimeException(s"Unexpected content type: ${titled.content.getClass.getCanonicalName}")
}
list.edit(1)
}
}
If you didn't have any other means of referencing the list object, that would be your only option.

Is it possible to build an OLTP/CRUD HTTP server using AkkaHttp, AkkaStreams, Alpakka and a database?

It is clear to me that using Actors of course it is possible: for instance https://github.com/chbatey/akka-http-typed.git is using AkkaHttp and typed actors.
But it is unclear to me whether, using just AkkaStreams and its Alpakka connectors library (which includes databases), it is possible to build regular CRUD / OLTP services, or only data replication from one database to another and other OLAP / batch / stream processing scenarios.
If you know how it can be done, please give a few details, and if you can provide an example on GitHub, for instance, that would be great.
The way I am thinking it may be possible is that the server is involved in two conversations / stateful stream transformations: one with the outside world over HTTP, and one with the database. I am not sure whether it can be modelled like that.
https://doc.akka.io/docs/alpakka/current/slick.html seems to offer both UPDATE/INSERT as a Sink as well as a pointed SELECT of a certain id as a Source. Do you know if an example app exists, or can you broadly describe how the wiring would happen with Akka Http?
I put a demo here; hope it can help you.
Create the table (the database is MySQL):
CREATE TABLE test(id VARCHAR(32))
sbt:
"com.lightbend.akka" %% "akka-stream-alpakka-slick" % "1.1.0",
"mysql" % "mysql-connector-java" % "5.1.40"
Code:
package tech.parasol.scala.crud
import java.sql.SQLException
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives.{complete, get, path, _}
import akka.stream.alpakka.slick.scaladsl.{Slick, SlickSession}
import akka.stream.scaladsl.Sink
import akka.stream.{ActorAttributes, ActorMaterializer, Supervision}
import com.typesafe.config.ConfigFactory
import scala.concurrent.Future
import scala.io.StdIn
import scala.util.{Failure, Success}
object CrudTest1 {
def main(args: Array[String]): Unit = {
implicit val system = ActorSystem("CrudTest1")
implicit val materializer = ActorMaterializer()
implicit val executionContext = system.dispatcher
val hostName = "120.0.0.1"
val rocketDbConfig =
s"""
|db-config {
| profile = "slick.jdbc.MySQLProfile$$"
| db {
| dataSourceClass = "slick.jdbc.DriverDataSource"
| properties = {
| driver = "com.mysql.jdbc.Driver"
| url = "jdbc:mysql://${hostName}:3306/rocket?useUnicode=true&characterEncoding=utf8&rewriteBatchedStatements=true&useSSL=false"
| user = "root"
| password = "passw0rd"
| }
| }
|}
|
""".stripMargin
implicit val session = SlickSession.forConfig("db-config", ConfigFactory.parseString(rocketDbConfig))
import session.profile.api._
def persistence(message: String) = {
def insert(message: String): DBIO[Int] = {
sqlu"""INSERT INTO test(id) VALUES (${message})"""
}
session.db.run(insert(message)).map {
case _ => message
}.recover {
case e : SQLException => {
throw new Exception("Database error ===>")}
case e : Exception => {
throw new Exception("Database error.")}
}
}
val route = path("hello" / Segment ) { name =>
get {
val res = persistence(name)
onComplete(res) {
case Success(value) => {
complete(s"<h1>Say hello to ${name}</h1>")
}
case Failure(e) => {
complete(s"<h1>Failed to say hello to ${name}</h1>")
}
}
}
}
val bindingFuture = Http().bindAndHandle(route, "localhost", 8088)
println(s"Server online at http://localhost:8088/\nPress RETURN to stop...")
StdIn.readLine() // let it run until user presses return
bindingFuture
.flatMap(_.unbind()) // trigger unbinding from the port
.onComplete(_ => system.terminate()) // and shutdown when done
}
}
Yes. Basically, for every request received in Akka HTTP, we create an Akka Streams graph (typically just a pipeline): essentially the Alpakka Slick Source reading from the database, possibly prefixed by some operators, and then return it in Akka HTTP, which of course supports completing a request with a Source. More details at https://www.quora.com/Is-it-possible-to-build-an-OLTP-CRUD-HTTP-server-using-Akka-HTTP-Akka-Streams-Alpakka-and-a-database-Do-you-know-any-examples-of-code-on-GitHub-or-elsewhere/answer/Nicolae-Marasoiu
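For illustration, here is a minimal sketch (not part of the original answer; the table name and the "db-config" block follow the demo above) of completing an Akka HTTP route directly with an Alpakka Slick Source, so the rows are streamed to the client as a chunked response:

import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{ContentTypes, HttpEntity}
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer
import akka.stream.alpakka.slick.scaladsl.{Slick, SlickSession}
import akka.util.ByteString

object StreamingReadDemo extends App {
  implicit val system = ActorSystem("StreamingReadDemo")
  implicit val materializer = ActorMaterializer()
  // Assumes a "db-config" block like the one in the answer above is on the classpath
  implicit val session = SlickSession.forConfig("db-config")
  import session.profile.api._

  val route = path("ids") {
    get {
      // Slick.source streams the SELECT result row by row;
      // Akka HTTP accepts a Source[ByteString, _] as a chunked response entity.
      val rows = Slick
        .source(sql"SELECT id FROM test".as[String])
        .map(id => ByteString(id + "\n"))
      complete(HttpEntity(ContentTypes.`text/plain(UTF-8)`, rows))
    }
  }

  Http().bindAndHandle(route, "localhost", 8088)
}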

"How to fix 'Cannot resolve method "iterations and getFeatureMatrix "'?

" I'm new in neural networks and DL4j, and I want to train neural network with CSV and build linear regression. How can I fix these errors "Cannot resolve method'.iterations and getFeatureMatrix()'"?
"Previously I'm tried to do that, but have another error in 'seed'".
import org.datavec.api.records.reader.RecordReader;
import org.datavec.api.records.reader.impl.csv.CSVRecordReader;
import org.datavec.api.split.FileSplit;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.Updater;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.nd4j.evaluation.classification.Evaluation;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.lossfunctions.LossFunctions;
import java.io.File;
public class Data {
public static void main(String[] args) throws Exception {
// Parameters:
int seed = 3000;
int batchSize = 200;
double learningRate = 0.001;
int nEpochs = 150;
int numInputs = 2;
int numOutputs = 2;
int numHiddenNodes = 100;
// Load data:
//load data train
RecordReader rr = new CSVRecordReader();
rr.initialize(new FileSplit(new File("train.csv")));
DataSetIterator trainIter = new RecordReaderDataSetIterator(rr, batchSize, 0, 2);
//load test data
RecordReader rrTest = new CSVRecordReader();
rr.initialize(new FileSplit(new File("test.csv")));
DataSetIterator testIter = new RecordReaderDataSetIterator(rrTest, batchSize, 0, 2);
// Network configuration:
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
.seed(seed)
.iterations(1000)
.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
.learningRate(learningRate)
.updater(Updater.NESTEROVS).momentum(0.9)
.list()
.layer(0, new DenseLayer.Builder()
.nIn(numInputs)
.nOut(numHiddenNodes)
.weightInit(WeightInit.XAVIER)
.activation(Activation.fromString("relu"))
.build())
.layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
.weightInit(WeightInit.XAVIER)
.activation(Activation.fromString("softmax"))
.weightInit(WeightInit.XAVIER)
.nIn(numHiddenNodes)
.nOut(numOutputs)
.build()
)
.pretrain(false).backprop(true).build();
MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
model.setListeners(new ScoreIterationListener((15)));
for (int n = 0; n < nEpochs; n++) {
model.fit((trainIter));
System.out.println(("--------------eval model"));
Evaluation eval = new Evaluation(numOutputs);
while (testIter.hasNext()) {
DataSet t = testIter.next();
INDArray features = getFeatureMatrix();
INDArray lables = t.getLabels();
INDArray predicted = model.output(features, false);
eval.eval(lables, predicted);
}
System.out.println(eval.stats());
}
}
}
First, you should consider using more classes (for example, one for the definition of the neural network, one for the training process, etc.). That is just a best-practice remark.
I do not know which version of DL4J you're using, but getFeatureMatrix() has been removed. Also, this function should be called on a DataSet object, not "statically" as you seem to do (you should call t.getFeatureMatrix()).
It is much the same for the iterations() function of the neural network configuration: it has been removed in recent DL4J releases. You can find more information about this function in this thread. You now have to find an alternative way to set the number of iterations; you can take a look at this thread. Hope this answers your question!

mock input dstream apache spark

I am trying to mock the input DStream while writing a Spark Streaming unit test. I am able to mock the RDDs, but when I try to convert them into a DStream, the DStream object comes up empty. I have used the following code:
val lines = mutable.Queue[RDD[String]]()
val dstream = streamingContext.queueStream(lines)
// append data to DStream
lines += sparkContext.makeRDD(Seq("To be or not to be.", "That is the question."))
Any help regarding the same will be highly appreciated.
You can write unit tests for all of DataFrameWriter, DataFrameReader, DataStreamReader, and DataStreamWriter.
The sample test cases below follow these steps: mock, behavior, assertion.
Maven based dependencies:
<dependency>
    <groupId>org.scalatestplus</groupId>
    <artifactId>mockito-3-4_2.11</artifactId>
    <version>3.2.3.0</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.mockito</groupId>
    <artifactId>mockito-inline</artifactId>
    <version>2.13.0</version>
    <scope>test</scope>
</dependency>
Let's use an example of a Spark class where the source is Hive and the sink is JDBC:
import java.util.Properties
import org.apache.spark.sql.{DataFrame, SparkSession}

class DummySource extends SparkPipeline {
  /**
   * Method to read the source and create a Dataframe
   *
   * @param spark : SparkSession
   * @return : DataFrame
   */
  override def read(spark: SparkSession): DataFrame = {
    spark.read.table("Table_Name").filter("_2 > 1")
  }

  /**
   * Method to transform the dataframe
   *
   * @param df : DataFrame
   * @return : DataFrame
   */
  override def transform(df: DataFrame): DataFrame = ???

  /**
   * Method to write/save the Dataframe to a target
   *
   * @param df : DataFrame
   */
  override def write(df: DataFrame): Unit =
    df.write.jdbc("url", "targetTableName", new Properties())
}
Mocking Read
test("Spark read table") {
val dummySource = new DummySource()
val sparkSession = SparkSession
.builder()
.master("local[*]")
.appName("mocking spark test")
.getOrCreate()
val testData = Seq(("one", 1), ("two", 2))
val df = sparkSession.createDataFrame(testData)
df.show()
val mockDataFrameReader = mock[DataFrameReader]
val mockSpark = mock[SparkSession]
when(mockSpark.read).thenReturn(mockDataFrameReader)
when(mockDataFrameReader.table("Table_Name")).thenReturn(df)
dummySource.read(mockSpark).count() should be(1)
}
Mocking Write
test("Spark write") {
val dummySource = new DummySource()
val mockDf = mock[DataFrame]
val mockDataFrameWriter = mock[DataFrameWriter[Row]]
when(mockDf.write).thenReturn(mockDataFrameWriter)
when(mockDataFrameWriter.mode(SaveMode.Append)).thenReturn(mockDataFrameWriter)
doNothing().when(mockDataFrameWriter).jdbc("url", "targetTableName", new Properties())
dummySource.write(df = mockDf)
}
Streaming code is covered in the referenced article.
Ref: https://medium.com/walmartglobaltech/spark-mocking-read-readstream-write-and-writestream-b6fe70761242
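Following the same pattern, here is a minimal sketch (not from the referenced article) of mocking a streaming read; it assumes the same ScalaTest/Mockito setup and imports as the tests above, plus org.apache.spark.sql.streaming.DataStreamReader, and the "kafka" source name is just an example:

test("Spark readStream") {
  val mockSpark = mock[SparkSession]
  val mockStreamReader = mock[DataStreamReader]
  val mockStreamingDf = mock[DataFrame]
  // Stub the fluent readStream chain so no real streaming source is needed
  when(mockSpark.readStream).thenReturn(mockStreamReader)
  when(mockStreamReader.format("kafka")).thenReturn(mockStreamReader)
  when(mockStreamReader.load()).thenReturn(mockStreamingDf)
  mockSpark.readStream.format("kafka").load() should be(mockStreamingDf)
}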

Missing invocation to mocked type at this point;

I am new to JMockit. I am trying to mock multiple instances of the java.io.File type in a method. There are some places where I shouldn't mock the File object; for that reason, I am using @Injectable. It is throwing the exception below.
I don't want to mock all the instances of java.io.File. I want the instances returned from the methods to be actual Files.
Below is the test class.
/**
*
*/
package org.iis.uafdataloader.tasklet;
import static org.junit.Assert.fail;
import java.io.File;
import java.io.FilenameFilter;
import java.io.IOException;
import java.util.regex.Pattern;
import mockit.Expectations;
import mockit.Injectable;
import mockit.Mocked;
import mockit.NonStrictExpectations;
import mockit.VerificationsInOrder;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.filefilter.RegexFileFilter;
import org.iis.uafdataloader.tasklet.validation.FileNotFoundException;
import org.junit.Test;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.repeat.RepeatStatus;
/**
* @author K23883
*
*/
public class FileMovingTaskletTest {
private FileMovingTasklet fileMovingTasklet;
@Mocked
private StepContribution contribution;
@Mocked
private ChunkContext chunkContext;
/**
* Test method for
* {@link org.iis.uafdataloader.tasklet.FileMovingTasklet#execute(org.springframework.batch.core.StepContribution, org.springframework.batch.core.scope.context.ChunkContext)}
* .
*
* @throws Exception
*/
@Test
public void testExecuteWhenWorkingDirDoesNotExist(
// @Mocked final File file,
@Injectable final File sourceDirectory,
@Injectable final File workingDirectory,
@Injectable final File archiveDirectory,
@Mocked final RegexFileFilter regexFileFilter,
@Mocked final FileUtils fileUtils) throws Exception {
fileMovingTasklet = new FileMovingTasklet();
fileMovingTasklet.setSourceDirectoryPath("sourceDirectoryPath");
fileMovingTasklet.setInFileRegexPattern("inFileRegexPattern");
fileMovingTasklet.setArchiveDirectoryPath("archiveDirectoryPath");
fileMovingTasklet.setWorkingDirectoryPath("workingDirectoryPath");
final File[] sourceDirectoryFiles = new File[] {
new File("sourceDirectoryPath/ISGUAFFILE.D140728.C00"),
new File("sourceDirectoryPath/ISGUAFFILE.D140729.C00") };
final File[] workingDirectoryFiles = new File[] {
new File("workingDirectoryPath/ISGUAFFILE.D140728.C00"),
new File("workingDirectoryPath/ISGUAFFILE.D140729.C00") };
new NonStrictExpectations(){{
new File("sourceDirectoryPath");
result = sourceDirectory;
sourceDirectory.exists();
result = true;
sourceDirectory.isDirectory();
result = true;
// workingDirectory =
new File("workingDirectoryPath");
result = workingDirectory;
workingDirectory.exists();
result = false;
workingDirectory.mkdirs();
FileUtils.cleanDirectory(onInstance(workingDirectory));
FilenameFilter fileNameFilter = new RegexFileFilter(anyString,
Pattern.CASE_INSENSITIVE);
sourceDirectory.listFiles(fileNameFilter);
result = sourceDirectoryFiles;
System.out.println("sourceDirectoryFile :"
+ ((File[]) sourceDirectoryFiles).length);
// for (int i = 0; i < sourceDirectoryFiles.length; i++) {
// FileUtils.moveFileToDirectory(sourceDirectoryFiles[i],
// workingDirectory, true);
// }
// archiveDirectory =
new File("archiveDirectoryPath");
result = archiveDirectory;
workingDirectory.listFiles();
result = workingDirectoryFiles;
// for (int i = 0; i < workingDirectoryFiles.length; i++) {
// FileUtils.copyFileToDirectory(workingDirectoryFiles[i],
// archiveDirectory);
// }
}};
RepeatStatus status = fileMovingTasklet.execute(contribution,
chunkContext);
assert (status == RepeatStatus.FINISHED);
new VerificationsInOrder() {{
sourceDirectory.exists();
onInstance(sourceDirectory).isDirectory();
onInstance(workingDirectory).exists();
onInstance(workingDirectory).mkdirs();
onInstance(sourceDirectory).listFiles((FilenameFilter)any);
FileUtils.moveFileToDirectory((File)any, onInstance(workingDirectory), true);
times = 2;
FileUtils.copyFileToDirectory((File)any, onInstance(archiveDirectory));
times= 2;
}};
}
}
Below is the actual implementation method.
/*
* (non-Javadoc)
*
* @see org.springframework.batch.core.step.tasklet.Tasklet#execute(org.
* springframework.batch.core.StepContribution,
* org.springframework.batch.core.scope.context.ChunkContext)
*/
@Override
public RepeatStatus execute(StepContribution contribution,
ChunkContext chunkContext) throws Exception {
File sourceDirectory = new File(sourceDirectoryPath);
if (sourceDirectory == null || !sourceDirectory.exists()
|| !sourceDirectory.isDirectory()) {
throw new FileNotFoundException("The source directory '"
+ sourceDirectoryPath
+ "' doesn't exist or can't be read or not a directory");
}
File workingDirectory = new File(workingDirectoryPath);
if (workingDirectory != null && !workingDirectory.exists() ) {
workingDirectory.mkdirs();
}
FileUtils.cleanDirectory(workingDirectory);
FilenameFilter fileFilter = new RegexFileFilter(inFileRegexPattern,
Pattern.CASE_INSENSITIVE);
File[] sourceDirectoryFiles = sourceDirectory.listFiles(fileFilter);
System.out.println("sourceDirectoryFiles : " + sourceDirectoryFiles.length);
for (File file : sourceDirectoryFiles) {
FileUtils.moveFileToDirectory(file, workingDirectory, true);
}
File archiveDirectory = new File(archiveDirectoryPath);
for (File file : workingDirectory.listFiles()) {
FileUtils.copyFileToDirectory(file, archiveDirectory);
}
return RepeatStatus.FINISHED;
}
Below is the stack trace.
java.lang.IllegalStateException: Missing invocation to mocked type at this point; please make sure such invocations appear only after the declaration of a suitable mock field or parameter
at org.iis.uafdataloader.tasklet.FileMovingTaskletTest$1.<init>(FileMovingTaskletTest.java:75)
at org.iis.uafdataloader.tasklet.FileMovingTaskletTest.testExecuteWhenWorkingDirDoesNotExist(FileMovingTaskletTest.java:71)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Please help me solve this problem.
@Injectable gives you a single mocked instance; it won't affect other instances of the mocked type. So, when the test attempts to record new File("sourceDirectoryPath"), it says "missing invocation to mocked type at this point" precisely because the File(String) constructor is not mocked.
To mock the entire File class (including its constructors) so that all instances are affected, you need to use @Mocked instead, as the following example shows:
@Test
public void mockFutureFileObjects(@Mocked File anyFile) throws Exception
{
final String srcDirPath = "sourceDir";
final String wrkDirPath = "workingDir";
new NonStrictExpectations() {{
File srcDir = new File(srcDirPath);
srcDir.exists(); result = true;
srcDir.isDirectory(); result = true;
File wrkDir = new File(wrkDirPath);
wrkDir.exists(); result = true;
}};
sut.execute(srcDirPath, wrkDirPath);
}
The JMockit Tutorial describes the same mechanism, although with a slightly different syntax.
This said, I would suggest instead to write the test with real files and directories.